sebastiannicolajsen/dblp-fetcher

dblp-fetcher

A repository for fetching and skimming articles from specific venues (journals and conferences) via dblp. It uses a custom scraper because the existing API is lacking.

To execute:

npm install
npm start [venue_1] [venue_2] ... [venue_n]

The script then fetches the articles and lets you navigate them using the following keys:

[F]orward*, [B]ackward*, [O]pen, [L]abel, [C]omment, [S]ave, [G]o  (*hold shift to `jump`)

(Comments are finished by pressing enter)

(A jump moves you to the next proceedings, workshop, or conference title in the list. These titles are included when papers are fetched, which makes jumping a convenient way to keep track of progress.)

The script can also be run with the following flags (always add -- after the venue specification to enable them):

--silent (disables automatic opening when navigating using F/B)
--save [filename] (saves the fetched articles in JSON format for further processing or reloading)
--file [filename] (utilises a specific file instead of fetching anew)
--wrapper [url] (wraps the url when using [O]pen, can be utilised for google searches)
--only-label (removes all entries which do not have a label on them)
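
As an illustration of how the venues and flags above fit together, here is a minimal sketch of an argument parser. It is not taken from the repository: the function name `parseArgs`, the `VALUE_FLAGS` set, and the shape of the returned object are all assumptions made for this example.

```javascript
// Hypothetical sketch, not the repository's actual parser.
// Flags listed here are assumed to take one value; all others are treated as booleans.
const VALUE_FLAGS = new Set(["--save", "--file", "--wrapper"]);

function parseArgs(argv) {
  const venues = [];
  const flags = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--") continue; // skip the separator if it is passed through
    if (!arg.startsWith("--")) {
      venues.push(arg); // anything that is not a flag is treated as a venue
    } else if (VALUE_FLAGS.has(arg)) {
      flags[arg.slice(2)] = argv[++i]; // consume the value that follows the flag
    } else {
      flags[arg.slice(2)] = true;
    }
  }
  return { venues, flags };
}

// Arguments as they might arrive from:
//   npm start venue_1 venue_2 -- --silent --save out.json
console.log(parseArgs(["venue_1", "venue_2", "--silent", "--save", "out.json"]));
```

In this sketch, everything that does not start with `--` is collected as a venue, so the order of venues and flags does not matter to the parser itself.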

Filters

The script also uses filters to remove specific entries. These are specified in ./src/filters.js. Currently, the only filter there removes articles with the word "transcript" in them. It runs after the initial load, as defined by its initial process tag. Other process tags result from running the different handlers.
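
To make the idea of a tag-bound filter concrete, here is a minimal sketch. The actual format expected by ./src/filters.js is not shown in this README, so the object shape (`tag`, `keep`), the `title` field, the tag value `"initial"`, and the case-insensitive match are all assumptions.

```javascript
// Hypothetical filter shape, not the format used by ./src/filters.js:
// `tag` names the process tag after which the filter runs,
// `keep` returns false for entries that should be removed.
const filters = [
  {
    tag: "initial",
    // Drop entries whose (assumed) title field mentions "transcript".
    keep: (entry) => !/transcript/i.test(entry.title),
  },
];

// Apply every filter registered for the given process tag.
function applyFilters(entries, tag) {
  return filters
    .filter((f) => f.tag === tag)
    .reduce((remaining, f) => remaining.filter(f.keep), entries);
}

const sample = [
  { title: "A Study of Scrapers" },
  { title: "Panel Transcript: The Future of Venues" },
];
console.log(applyFilters(sample, "initial"));
```

Binding each filter to a tag means a filter only sees the data at the stage it was written for, which matches the description above of the transcript filter running right after the initial load.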

Handlers

The handler chain specifies how the input data is processed after fetching. Which handlers are used, and in what order, is specified in ./src/handles/index.js. The same file describes how handlers should be defined.

Some handlers only run if a given flag is set, as is the case with the url-wrapper handler, which only runs with the --wrapper [url] flag. The url-wrapper is a good place to look for a simple example.
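
The following sketch shows one way a flag-gated handler chain could look. It is an illustration only: the real handler specification lives in ./src/handles/index.js and may differ. The names `runChain`, `urlWrapper`, and `countEntries`, the `flag` and `run` properties, the `address` field, and wrapping by prefixing the URL-encoded address are all assumptions.

```javascript
// Hypothetical handler shape, not the repository's specification:
// `tag` is the process tag, `flag` (optional) names the flag that enables the handler,
// `run` receives the entries and the flags and returns the entries to pass on.
const urlWrapper = {
  tag: "wrapped_url",
  flag: "wrapper",
  run: (entries, flags) =>
    entries.map((e) => ({
      ...e,
      // Assumed wrapping strategy: prefix the encoded address with the wrapper URL.
      address: flags.wrapper + encodeURIComponent(e.address),
    })),
};

// A handler without a `flag` property always runs.
const countEntries = {
  tag: "count",
  run: (entries) => {
    console.log(`${entries.length} entries in the chain`);
    return entries;
  },
};

// Run the handlers in order, skipping those whose flag is not set.
// Kept synchronous for brevity; a handler that fetches data would need async handling.
function runChain(handlers, entries, flags) {
  let current = entries;
  for (const handler of handlers) {
    if (handler.flag && !flags[handler.flag]) continue;
    current = handler.run(current, flags);
  }
  return current;
}

const result = runChain(
  [countEntries, urlWrapper],
  [{ address: "https://example.org/paper" }],
  { wrapper: "https://www.google.com/search?q=" }
);
console.log(result[0].address);
```

Because each handler returns the entries for the next one, a handler that consumes its input belongs at the end of the chain.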

Included handlers

open-browser - opens all addresses, if included, and passes the input object forward. [process_tag: open_browser]

abstract-fetcher - uses Crossref to download and append XML information about the publication via its DOI. This MAY contain the abstract, but often does not. [process_tag: fetch_abstract]

url-wrapper - wraps the address field of the input and returns the modified object. [process_tag: wrapped_url]

iterate-open-save - provides the default controls introduced above and consumes the input (should be last in the chain). [process_tag: iterate_open_save]
