scraper.coffee (src/) | |
---|---|
External dependencies | whenjs = require 'when'
unfoldList = require 'when/unfold/list'
cheerio = require 'cheerio' |
Internal dependencies | DATA_FEED_URLS = require './data_urls'
Parser = require './parser'
Downloader = require './downloader' |
Pandata | |
This is the main class of the module; it exposes the API.
| module.exports = class Pandata
constructor: (@webname) ->
@parser = new Parser |
get: Look up a Pandora webname by searching for a string (such as an email address) and pass it to a callback. The callback receives an error (or null) as its first argument and the matching webname (or null) as its second. | @get: (user_id, callback) ->
search_url = DATA_FEED_URLS.user_search.replace('%{searchString}', user_id)
Downloader.read_page search_url, (data) ->
$ = cheerio.load data
webnames = Parser.get_webnames_from_search($)
if user_id in webnames
callback null, user_id
else if webnames.length is 1 and /.*@.*\..*/.test user_id
callback null, webnames[0]
else if webnames? and webnames[0] isnt undefined
callback null, webnames[0]
else
callback Error("""[Pandata] Couldn't find a Pandora user with that
email or webname."""), null |
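The branch order above can be tried without any network access. The following is a minimal sketch, assuming the search results have already been parsed into an array of webnames; `pick_webname` is a hypothetical helper that does not exist in the module, and it returns `null` where the real method would pass an `Error` to the callback.

```coffeescript
# Hypothetical helper (not part of the module): mirrors the branch order of
# `Pandata.get`, but takes an already-parsed array of webnames as input.
pick_webname = (user_id, webnames) ->
  if user_id in webnames
    # An exact webname match wins.
    user_id
  else if webnames.length is 1 and /.*@.*\..*/.test user_id
    # An email-like search string with exactly one hit: use that hit.
    webnames[0]
  else if webnames[0] isnt undefined
    # Otherwise fall back to the first hit, if there is one.
    webnames[0]
  else
    # No hits at all; the real method reports an Error here.
    null

console.log pick_webname('alice', ['bob', 'alice'])   # alice
console.log pick_webname('a@example.com', ['alice'])  # alice
console.log pick_webname('nobody', [])                # null
```

Note that the email branch and the fallback branch both return the first hit, so the email check does not change the result.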
API | |
recent_activity: Get a list of artists and tracks a user has listened to recently. Returns a promise for an array of tracks and artists. A track
is an object with
| recent_activity: ->
@_scrape_for('recent_activity', 'get_recent_activity') |
playing_station: Get the station a user is currently playing, meaning the one currently selected for playback in Pandora; the user may not actually be on pandora.com listening to it. Returns a promise for the name of that station as a string.
| playing_station: ->
@_scrape_for('playing_station', 'get_playing_station')
.then (result) -> return result[0] |
stations: Get the stations a user has listened to or created. Returns a promise for an array of station names.
| stations: ->
@_scrape_for('stations', 'get_stations') |
bookmarks: Get tracks and artists a user has bookmarked. Returns a promise for an object containing arrays of artists and tracks.
Also accepts an optional string argument to limit the
result to a particular category, either 'tracks' or 'artists'.
| bookmarks: (bookmark_type = 'all') ->
switch bookmark_type
when 'tracks'
return @_scrape_for('bookmarked_tracks', 'get_bookmarked_tracks')
when 'artists'
return @_scrape_for('bookmarked_artists', 'get_bookmarked_artists')
when 'all' |
Wait for all the scraping promises to resolve before combining them into the returned object | whenjs.all([
@bookmarks('artists')
@bookmarks('tracks')
]).then(
(results) ->
artists: results[0]
tracks: results[1]
(reason) ->
console.error reason
) |
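The combining step can be sketched in isolation. This example swaps in the native `Promise.all` for `whenjs.all` and uses two hypothetical stand-in functions (`fetch_artists`, `fetch_tracks`) with made-up data in place of the real scraping promises, so it is an illustration of the pattern rather than the module's code.

```coffeescript
# Hypothetical stand-ins for the two scraping promises (made-up data).
fetch_artists = -> Promise.resolve ['Artist A']
fetch_tracks  = -> Promise.resolve ['Track 1', 'Track 2']

combine_bookmarks = ->
  # Native Promise.all is used here in place of whenjs.all.
  Promise.all([fetch_artists(), fetch_tracks()]).then (results) ->
    # Results arrive in the same order as the input array.
    artists: results[0]
    tracks: results[1]

combine_bookmarks().then (bookmarks) ->
  console.log bookmarks
```

With standard promise semantics, a rejection handler that only logs the reason, as the one above does, leaves the returned promise resolved with `undefined`, so callers should be prepared for a missing result.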
likes: Get tracks, artists, stations, and albums a user has liked. Returns a promise for an object containing arrays of tracks, artists, stations, and albums.
Also accepts an optional string argument to limit the
result to a particular category, either 'tracks', 'artists', 'stations', or 'albums'.
| likes: (like_type = 'all') ->
switch like_type
when 'tracks'
@_scrape_for('liked_tracks', 'get_liked_tracks')
when 'artists'
@_scrape_for('liked_artists', 'get_liked_artists')
when 'stations'
@_scrape_for('liked_stations', 'get_liked_stations')
when 'albums'
@_scrape_for('liked_albums', 'get_liked_albums')
when 'all' |
Wait for all the scraping promises to resolve before combining them into the returned object | whenjs.all([
@likes('artists')
@likes('albums')
@likes('stations')
@likes('tracks')
]).then(
(results) ->
artists: results[0]
albums: results[1]
stations: results[2]
tracks: results[3]
(reason) ->
console.error reason
) |
following: Get the Pandora users this user is following. Returns a promise for an array of user objects.
| following: ->
@_scrape_for('following', 'get_following') |
followers: Get the Pandora users that follow this user. Returns a promise for an array of user objects.
| followers: ->
@_scrape_for('followers', 'get_followers') |
Private methods | |
Downloads all data of a given type and calls the supplied parser method on each downloaded page.
Returns a promise for the array of results. | _scrape_for: (data_type, parser_method) -> |
This is called iteratively by unfoldList | unspool = (next_data_indices) => |
Must return a promise | deferred = whenjs.defer() |
We'll give the resolver to the download callback | resolver = deferred.resolver
url = @_get_url data_type, next_data_indices
Downloader.read_page url, (data) => |
Check if we're getting XML; if so, use cheerio's xmlMode | $ = cheerio.load data, (if /\.xml/.test url then {xmlMode: yes} else {}) |
Pass the parsed DOM object to the parser method | result = @parser[parser_method]($)
next_data_indices = @parser.get_next_data_indices($) |
| return resolver.resolve([result, next_data_indices])
return deferred.promise |
The condition upon which unfoldList stops iterating | condition = (next_data_indices) =>
return (if next_data_indices? then \
@_is_empty next_data_indices else next_data_indices) |
The initial seed for unfoldList | initial_data_indices = null
seed = initial_data_indices |
| return unfoldList(unspool, condition, seed)
.then( |
Resolution | (results) -> |
Flatten the resulting array | return [].concat results... |
Rejection | (reason) ->
console.error reason
) |
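The paging loop in `_scrape_for` can be illustrated without Pandora or when.js. The sketch below assumes a hypothetical in-memory `PAGES` source and a simplified `unfold_list` built on native promises; it is a stand-in for `when/unfold/list` that follows the same `(unspool, condition, seed)` calling convention, not that library's actual implementation.

```coffeescript
# Simplified stand-in for `when/unfold/list` (an assumption, not the real code):
# check the condition first, then unspool, collecting one value per step.
unfold_list = (unspool, condition, seed, collected = []) ->
  Promise.resolve(condition(seed)).then (done) ->
    return collected if done
    Promise.resolve(unspool(seed)).then ([value, next_seed]) ->
      unfold_list unspool, condition, next_seed, collected.concat([value])

# Hypothetical paged data source: three pages of results.
PAGES = [['a', 'b'], ['c', 'd'], ['e']]

# Produce [results_for_this_page, indices_for_next_page].
unspool = (indices) ->
  page = indices?.nextStartIndex ? 0
  next_indices = if page + 1 < PAGES.length then {nextStartIndex: page + 1} else {}
  Promise.resolve [PAGES[page], next_indices]

# Stop once a page reports no further indices; a null seed means "not started yet".
condition = (indices) ->
  indices? and Object.keys(indices).length is 0

scrape_all = ->
  unfold_list(unspool, condition, null).then (pages) ->
    # Flatten the array of per-page arrays into one list.
    [].concat pages...

scrape_all().then (items) ->
  console.log items  # [ 'a', 'b', 'c', 'd', 'e' ]
```

Because the condition is checked before each call to `unspool`, the `null` seed has to be treated as "keep going"; otherwise the first page would never be fetched.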
Grab a URL from DATA_FEED_URLS and fill in its parameters | _get_url: (data_type, next_data_indices = null) ->
unless next_data_indices?
next_data_indices =
nextStartIndex: 0
nextLikeStartIndex: 0
nextThumbStartIndex: 0 |
We want to set the webname parameter as well | next_data_indices['webname'] = @webname |
Grab the proper URL | url = DATA_FEED_URLS[data_type] |
Replace the parameters with values | for url_string_param of next_data_indices
url = url.replace(
new RegExp("%{"+url_string_param+"}")
, next_data_indices[url_string_param])
return url |
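The placeholder substitution can be tried on its own. This is a minimal sketch: `TEMPLATE` is a hypothetical URL (the real templates live in `./data_urls` and are not shown here), `fill_url` is an illustrative helper, and the braces in the pattern are escaped explicitly, a small departure from the original expression.

```coffeescript
# Hypothetical template; the real ones are defined in './data_urls'.
TEMPLATE = 'http://example.com/feed?webname=%{webname}&start=%{nextStartIndex}'

# Illustrative helper (not part of the module): replace each %{name}
# placeholder with the matching value from `params`.
fill_url = (template, params) ->
  url = template
  for name, value of params
    # Without the `g` flag, only the first occurrence of each placeholder is replaced.
    url = url.replace new RegExp("%\\{" + name + "\\}"), value
  url

console.log fill_url(TEMPLATE, webname: 'alice', nextStartIndex: 20)
# http://example.com/feed?webname=alice&start=20
```

Parameters that have no placeholder in the template are simply ignored, which is why the same set of indices can be applied to every feed URL.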
Utility method to check if an object is empty or not. | _is_empty: (obj) ->
if obj is null then return yes
if obj.length and obj.length > 0 then return no
if obj.length is 0 then return yes
for key of obj
if hasOwnProperty.call(obj, key) then return no
return yes
|