Alpha version of new Schedules Direct lineup information

Discussion about Schedules Direct grabber code and data formats.
rkulagow
SD Staff
Posts: 917
Joined: Tue Aug 14, 2007 3:15 pm

Alpha version of new Schedules Direct lineup information

Post by rkulagow »

Currently, data downloaded by our subscribers comes directly from Tribune Media Services and is hosted on their servers.

If we self-host, there are things which we can do to augment the data.

I have attached two examples of what the new data format will look like (it's JSON); the files specify two headends.

The IL57303 file contains the analog and the "X" (digital) lineup; it also contains QAM tuning data for this particular headend. Applications may take advantage of this information to make initial configuration of QAM easier for the end user.

The new format now contains a version number and a last-modified-date for each lineup. Applications may take advantage of these to let the user know that the lineup has been updated and take appropriate action.
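As a rough illustration of the update check described above, an application could compare the cached lineup against a freshly downloaded one. This is only a sketch: the key names `version` and `modified` are assumptions for illustration, not the confirmed field names of the lineup format.

```python
def lineup_needs_refresh(cached: dict, fresh: dict) -> bool:
    """Return True when the downloaded lineup differs from the cached copy."""
    # "version" and "modified" are hypothetical key names; substitute the
    # real ones from the lineup file once the format is settled.
    keys = ("version", "modified")
    return any(cached.get(key) != fresh.get(key) for key in keys)


cached = {"version": 1, "modified": "2012-08-20 10:00:00"}
fresh = {"version": 2, "modified": "2012-08-22 09:30:00"}

if lineup_needs_refresh(cached, fresh):
    print("Lineup has been updated; prompt the user to rescan channels.")
```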

If QAM information is available, it will be included.

Switching to this format means that users do not need to edit lineups on the Schedules Direct website to enable / disable channels. If the user / application doesn't want schedule data for a particular channel, don't download the schedule file for that XMLID. Each XMLID has a separate URL and contains two weeks of schedule information, so the new format may not be appropriate for low-power machines. However, because each XMLID file is available as a separate entity, an application can choose to download each individually and process the file before moving to the next one to spread processing load over a longer time interval.
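A minimal sketch of the one-file-at-a-time approach described above. The `fetch_schedule` and `store_schedule` callables are hypothetical stand-ins for an application's own download and import code; the per-XMLID URL layout is not assumed here. Channels the user does not want are simply left out of the `xmlids` list.

```python
import time
from typing import Callable, Iterable


def process_one_at_a_time(
    xmlids: Iterable[str],
    fetch_schedule: Callable[[str], dict],        # hypothetical: downloads one XMLID file
    store_schedule: Callable[[str, dict], None],  # hypothetical: imports it locally
    pause_seconds: float = 0.0,
) -> int:
    """Fetch and process each XMLID file in turn, pausing between files."""
    processed = 0
    for xmlid in xmlids:
        schedule = fetch_schedule(xmlid)  # only one schedule file is held at a time
        store_schedule(xmlid, schedule)
        processed += 1
        if pause_seconds:
            time.sleep(pause_seconds)     # spread the load over a longer interval
    return processed
```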

The "fta" file contains information for free-to-air satellites visible in the Western Hemisphere. The exact information required by FTA is still being reviewed, so there are some gaps.

An example of an XMLID file will be posted on 2012-08-23; it's also JSON and contains a superset of the information currently available.

Patches have been created for MythTV to use this data; a link will be provided on or about 2012-08-23.

At this time this is extremely alpha code, and the format is still subject to change. Post here if you have questions, or have suggestions on what should be included in the new data format.

rkulagow
SD Staff
Posts: 917
Joined: Tue Aug 14, 2007 3:15 pm

Re: Alpha version of new Schedules Direct lineup information

Post by rkulagow »

Just as an update, the format is being modified slightly to allow for more flexibility for people outside of North America, where digital transmission may involve polarization.

There will also be additional metadata available about stations:

{"market":"Chicago","country":"United States","callsign":"WBBM","marketrank":"3","name":"WBBM","affiliate":"CBS Affiliate","zipcode":"60611","state":"IL","city":"Chicago","freq":"2"}
{"market":"Detroit","country":"United States","callsign":"WPXD","marketrank":"11","name":"WPXD","affiliate":"ION Affiliate","zipcode":"48076","state":"MI","city":"Southfield","freq":"31"}

So in the above examples, applications will be able to provide hints to users regarding network affiliation, such as "ION" or "CBS" if it's not immediately apparent from the call sign.
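As a sketch of how an application might build such a hint from the station records above: the field names come from the examples in this post, but stripping the trailing " Affiliate" to get a network name is an assumption about how the `affiliate` string is formed.

```python
import json

# Trimmed copy of the WBBM record shown above.
SAMPLE = (
    '{"market":"Chicago","callsign":"WBBM","affiliate":"CBS Affiliate",'
    '"state":"IL","city":"Chicago"}'
)


def station_hint(station: dict) -> str:
    """Build a display label from call sign, network, and location."""
    callsign = station.get("callsign", "")
    affiliate = station.get("affiliate", "")
    suffix = " Affiliate"
    # Assumption: the network name is the affiliate string minus " Affiliate".
    network = affiliate[: -len(suffix)] if affiliate.endswith(suffix) else affiliate
    place = ", ".join(p for p in (station.get("city"), station.get("state")) if p)
    label = f"{callsign} ({network})" if network else callsign
    return f"{label}, {place}" if place else label


print(station_hint(json.loads(SAMPLE)))
```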

rkulagow
SD Staff
Posts: 917
Joined: Tue Aug 14, 2007 3:15 pm

Re: Alpha version of new Schedules Direct lineup information

Post by rkulagow »

An alpha grabber and additional information on the format of the data can now be obtained from github.

$ git clone git://github.com/rkulagowski/tv_grab_na_sd.git

There is an example directory containing over-the-air antenna, cable headend, and free-to-air satellite lineups, and 14 days of schedule information for one station.

Patches to MythTV will be released post MythTV 0.26.

Slugger
Posts: 77
Joined: Sun Sep 18, 2011 1:22 pm

Re: Alpha version of new Schedules Direct lineup information

Post by Slugger »

Even though it's labelled alpha, I assume since it's public that it's at least safe enough to go ahead and use this lineup info feed (and implement it) as long as I stay on top of any changes, which I assume you'll post in this thread as they are made?

rkulagow
SD Staff
Posts: 917
Joined: Tue Aug 14, 2007 3:15 pm

Re: Alpha version of new Schedules Direct lineup information

Post by rkulagow »

Slugger wrote: Even though it's labelled alpha, I assume since it's public that it's at least safe enough to go ahead and use this lineup info feed (and implement it) as long as I stay on top of any changes, which I assume you'll post in this thread as they are made?

If you'd like to participate, please send an email to grabber@schedulesdirect.org so that I can get you set up on the test server.

Right now the code to import into MythTV is still being worked on, and the format itself is being updated as issues are discovered and debugged.

rkulagow
SD Staff
Posts: 917
Joined: Tue Aug 14, 2007 3:15 pm

Re: Alpha version of new Schedules Direct lineup information

Post by rkulagow »

The wiki page at github for the grabber now provides details for the various fields.

rkulagow
SD Staff
Posts: 917
Joined: Tue Aug 14, 2007 3:15 pm

Re: Alpha version of new Schedules Direct lineup information

Post by rkulagow »

Programs now include metadata from thetvdb; we have a 46% "hit" rate at this time.

{"descr_60":"The Grammy Award-winning folk group Bon Iver perform.","title_10":"Austin","title_20":"Austin City Limits","descr2_100":"The Grammy Award-winning folk group Bon Iver perform.","series_description":"Country musicians perform.","genres":["Music"],"metadata":[{"seriesid":"71649","episode":"2","datasource":"thetvdb","season":"38"}],"made_for_tv":false,"descr_40":"The folk group Bon Iver perform.","modified":"2012-10-12 13:26:11","descr":"The Grammy Award-winning folk group Bon Iver perform.","color_code":"Color","title_40":"Austin City Limits","epi_title":"Bon Iver","title":"Austin City Limits","md5":"IlafwUJselWTQLzEraTahQ","alt_title":"","orig_air_date":"2012-10-13","descr2":"The Grammy Award-winning folk group Bon Iver perform.","descr_100":"The Grammy Award-winning folk group Bon Iver perform.","cast_and_crew":["Musical guest: Bon Iver"],"descr_lang_id":"English","source_type":"Network","prog_id":"EP000004390458","alt_syn_epi_num":"","title_70":"Austin City Limits","show_type":"Series","syn_epi_num":"3802","datatype":"program"}

{"descr_60":"Bart fools all into thinking a child fell down a well.","title_10":"Simpsons","title_20":"The Simpsons","descr2_100":"Bart fools the town into thinking a child has fallen down a well.","series_description":"Homer and Marge Simpson raise Bart, Lisa and baby Maggie.","genres":["Sitcom","Animated"],"metadata":[{"seriesid":"71663","episode":"13","datasource":"thetvdb","season":"3"}],"made_for_tv":false,"descr_40":"Bart lies about a child in a well.","modified":"2012-10-12 13:26:11","descr":"Bart fools the town into thinking a child has fallen down a well.","color_code":"Color","title_40":"The Simpsons","epi_title":"Radio Bart","title":"The Simpsons","md5":"+zA+MDea67Fu0SU0YTH/9Q","alt_title":"","orig_air_date":"1992-01-09","descr2":"Bart fools the town into thinking a child has fallen down a well.","descr_100":"Bart fools the town into thinking a child has fallen down a well.","cast_and_crew":["Actor:Dan Castellaneta","Actor:Julie Kavner","Actor:Nancy Cartwright","Actor:Yeardley Smith","Actor:Harry Shearer","Actor:Hank Azaria","Executive Producer:James L. Brooks","Executive Producer:Matt Groening","Executive Producer:David Mirkin"],"descr_lang_id":"English","source_type":"Syndicated","prog_id":"EP000186930001","alt_syn_epi_num":"8F11","title_70":"The Simpsons","show_type":"Series","syn_epi_num":"3ABF11","datatype":"program"}


The algorithm currently checks the name of the program and the episode title (subtitle) to determine the season / episode information from thetvdb. If the seriesid is incorrect though, please send email to grabber@schedulesdirect.org with the prog_id and the proper seriesid.
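A short sketch of how a client might read the season and episode out of a program record. The key names (`metadata`, `datasource`, `seriesid`, `season`, `episode`) are taken from the samples above; treating a program without a thetvdb entry as "no match" is an assumption based on the stated hit rate.

```python
import json
from typing import Optional, Tuple

# Trimmed copy of the Simpsons record shown above.
SAMPLE = (
    '{"title":"The Simpsons","epi_title":"Radio Bart","prog_id":"EP000186930001",'
    '"metadata":[{"seriesid":"71663","episode":"13","datasource":"thetvdb","season":"3"}]}'
)


def tvdb_episode(program: dict) -> Optional[Tuple[str, int, int]]:
    """Return (seriesid, season, episode) from thetvdb metadata, or None."""
    for entry in program.get("metadata", []):
        if entry.get("datasource") == "thetvdb":
            # The samples carry season and episode as strings, so convert them.
            return entry["seriesid"], int(entry["season"]), int(entry["episode"])
    return None  # no thetvdb match for this program


print(tvdb_episode(json.loads(SAMPLE)))
```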

rkulagow
SD Staff
Posts: 917
Joined: Tue Aug 14, 2007 3:15 pm

Re: Alpha version of new Schedules Direct lineup information

Post by rkulagow »

The new API has been entirely converted to use JSON. Please see details and examples at

https://github.com/rkulagowski/tv_grab_na_sd/wiki

hall5714
Posts: 20
Joined: Thu Feb 07, 2013 4:34 pm

Re: Alpha version of new Schedules Direct lineup information

Post by hall5714 »

A few questions:

(1) Are there plans to allow downloading schedules covering fewer than 14 days?
(1a) Would an option to download the lineups, programs and schedules as a single file (gzipped or zipped) be feasible (over lots of zipped text files)?
(3) Is it possible to move the TVDB info to the schedule, instead of the program?

The first because 14 days at once leaves the possibility of data getting stale (schedule changes), and it would make part 1a feasible (obviously 14 days of schedule data for all stations would simply be too much for most clients to handle). The third, because, frankly, much of the SD program data and TVDB series/episode data are the same thing, so it seems prudent to only download the data from a single source.

The third is more a thought than a suggestion, but 1 and 1a would be huge (being able to get a single gzip of the entire day's schedule in one run). I realize this would result in more HTTP requests, but you can always implement a "wait time" between requests to offset this. To me, in-place processing of the smaller chunks of data (i.e., 1 to 2 days) would make things much easier... or providing an updates.zip similar to TVDB would work as well.

rkulagow
SD Staff
Posts: 917
Joined: Tue Aug 14, 2007 3:15 pm

Re: Alpha version of new Schedules Direct lineup information

Post by rkulagow »

hall5714 wrote: A few questions:

(1) Are there plans to allow downloading schedules covering fewer than 14 days?

No. The schedule file itself is fairly lightweight, since it doesn't contain any real program details. The program files are what would make up the bulk of the transfer, and if there are a lot of repeats on a schedule then you only download the prog_id a single time.

hall5714 wrote: (1a) Would an option to download the lineups, programs and schedules as a single file (gzipped or zipped) be feasible (over lots of zipped text files)?

That was investigated, but it took way too long to assemble everything. Since the client is supposed to download the schedule first, determine which programs it needs and then request only those programs, we can't really bundle everything into one file.

hall5714 wrote: (3) Is it possible to move the TVDB info to the schedule, instead of the program?

Maybe, if you make your case strong enough.

hall5714 wrote: The first because 14 days at once leaves the possibility of data getting stale (schedule changes), and it would make part 1a feasible (obviously 14 days of schedule data for all stations would simply be too much for most clients to handle). The third, because, frankly, much of the SD program data and TVDB series/episode data are the same thing, so it seems prudent to only download the data from a single source.

You won't get stale data. Each time you get the sched file for a stationID, it will contain the next 12-14 days' worth of programs that are on the stationID. If any program gets updated (say a new guest star is added, or metadata is updated, or whatever) on "today + 4 days", then while the prog_id stays the same, the MD5 will change. The client should look for that. In the same way, if you've downloaded a program and it's in your database, and the program comes up again on the schedule, but the MD5 is the same, then you don't need to download it again.
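The MD5 check described above might look roughly like this on the client side. It is a sketch under assumptions: each schedule entry is assumed to be a dict carrying `prog_id` and `md5` keys, and `cached_md5` is a hypothetical mapping built from the client's own database.

```python
from typing import Dict, Iterable, List


def programs_to_fetch(
    schedule_entries: Iterable[dict],  # assumed shape: {"prog_id": ..., "md5": ...}
    cached_md5: Dict[str, str],        # hypothetical: prog_id -> MD5 already stored
) -> List[str]:
    """Return prog_ids that are new or whose MD5 differs from the cached copy."""
    needed: List[str] = []
    seen = set()
    for entry in schedule_entries:
        prog_id = entry["prog_id"]
        if prog_id in seen:
            continue  # repeats on the schedule only need one download
        seen.add(prog_id)
        if cached_md5.get(prog_id) != entry["md5"]:
            needed.append(prog_id)  # new program, or updated since last fetch
    return needed
```

A program that repeats on the schedule with an unchanged MD5 is skipped, while one whose MD5 has changed is requested again even though its prog_id is the same.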

The schedules are pre-generated, so no, can't make them a shorter duration; each stationID will have the next 12-14 days.

Looking at a recent request from a beta client earlier today as an example, the server put together a zip file with 17492 programs in it in 25 seconds and then sent it out in 4 seconds.

There's nothing that says that the client has to pull everything all at once; if you're in a low memory / low CPU situation, then download one stationID at a time, figure out which programs you need, and download them. If you want to only download 200 at a time instead of 17000, then only put 200 prog_ids in the request.
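Splitting a long list of prog_ids into smaller requests could be sketched as follows. How each batch is actually sent to the server is not shown; the batch size of 200 simply mirrors the figure mentioned above.

```python
from typing import Iterator, List, Sequence


def batches(prog_ids: Sequence[str], size: int = 200) -> Iterator[List[str]]:
    """Yield consecutive slices of prog_ids, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(prog_ids), size):
        yield list(prog_ids[start:start + size])


# Made-up identifiers for illustration only.
wanted = [f"EP{n:012d}" for n in range(450)]
for batch in batches(wanted):
    print(len(batch))  # each batch would go into one program request
```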
