Fixed array with single or multiples entries on json extractor #143

Merged
adbar merged 2 commits into adbar:master on Nov 22, 2021

Conversation

felipehertzer
Contributor

This PR fixes the issue reported in #140, where the @graph and liveBlogUpdate fields can hold either a single item with a different structure (a plain object) or an array of multiple items.

Single item with a different structure (a plain object, no array):

"liveBlogUpdate": {
    "@type": "BlogPosting",
    "headline": "Coming this April, HBO NOW will be available exclusively in the U.S. on Apple TV and the App Store.",
    "datePublished": "2015-03-09T13:08:00-07:00",
    "articleBody": "It's $14.99 a month.<br> And for a limited time, …"
}

Multiple items in an array:

"liveBlogUpdate":[
    {
      "@type":"BlogPosting",
      "headline":"Coming this April, HBO NOW will be available exclusively in the U.S. on Apple TV and the App Store.",
      "datePublished":"2015-03-09T13:08:00-07:00",
      "articleBody": "It's $14.99 a month.<br> And for a limited time, …"
    },
    {
      "@type":"BlogPosting",
      "headline":"iPhone is growing at nearly twice the rate of the rest of the smartphone market.",
      "datePublished":"2015-03-09T13:13:00-07:00",
      "image":"http://images.apple.com/live/2015-mar-event/images/573cb_xlarge_2x.jpg"
    }
]
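
The two shapes above can be handled by normalizing the field to a list before iterating over it. The following is a minimal sketch of that idea, not the code merged in this PR: the helper names as_list and iter_live_blog_updates are hypothetical, and the sample payloads are shortened stand-ins for the examples above.

```python
import json


def as_list(value):
    """Wrap a single value in a list; pass lists through unchanged (hypothetical helper)."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def iter_live_blog_updates(payload):
    """Yield each liveBlogUpdate entry, whether the field holds one object or an array."""
    for entry in as_list(payload.get("liveBlogUpdate")):
        # Skip anything that is not a JSON object, e.g. stray strings.
        if isinstance(entry, dict):
            yield entry


# Shortened sample payloads (assumed for illustration only).
single = json.loads('{"liveBlogUpdate": {"@type": "BlogPosting", "headline": "A"}}')
multiple = json.loads(
    '{"liveBlogUpdate": [{"@type": "BlogPosting", "headline": "A"},'
    ' {"@type": "BlogPosting", "headline": "B"}]}'
)

headlines_single = [e["headline"] for e in iter_live_blog_updates(single)]
headlines_multiple = [e["headline"] for e in iter_live_blog_updates(multiple)]
```

The same wrapper could be applied to @graph, assuming it shows the same single-object versus array variation.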

@codecov-commenter

Codecov Report

Merging #143 (fbbd55a) into master (7224971) will not change coverage.
The diff coverage is 50.00%.

@@           Coverage Diff           @@
##           master     #143   +/-   ##
=======================================
  Coverage   94.95%   94.95%           
=======================================
  Files          19       19           
  Lines        2714     2714           
=======================================
  Hits         2577     2577           
  Misses        137      137           

Impacted Files                 Coverage Δ
trafilatura/json_metadata.py   86.32% <50.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@adbar
Owner

adbar commented Nov 16, 2021

Hi @felipehertzer, thank you for the prompt answer! Could you please add test cases as well?

adbar linked an issue on Nov 18, 2021 that may be closed by this pull request
@adbar
Owner

adbar commented Nov 22, 2021

@felipehertzer Never mind, I took the time to add them myself.

Looks ready to merge.

adbar merged commit 1c1b6e4 into adbar:master on Nov 22, 2021

Successfully merging this pull request may close these issues.

Crash on a specific web page