KeyError in fast_link_fixer.py #35
Comments
I got the same error for HISTORY, "Q309". I searched for this id in the wikidata_ids.txt in data/wikidata/ and it was not found. This might be the cause: if an id is not in wikidata_ids.txt, then the item is not captured at all. I followed the exact steps given in README.md and am still getting the above error. @JonathanRaiman - Please help!
@heisenbugfix You've hit a rare instance of Wikipedia/Wikidata deprecating or merging articles. In general this happens when an article is deemed spammy or too similar to another article. My suggestion here is to look for "Aspect of History" (what Q17524420 originally stood for), try to find what it used to point to / was used for, and either find a good substitute or remove it altogether.
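For anyone trying to follow that suggestion, here is a minimal sketch (not part of this repo) that asks Wikidata's public Special:EntityData endpoint what a merged or deprecated Q-id now resolves to, so you can pick a substitute for the hard-coded id:

```python
# Sketch: resolve a possibly-merged Q-id via Wikidata's public
# Special:EntityData endpoint. The endpoint follows merge redirects,
# so the entity comes back keyed by the *target* id.
import requests

def resolve_qid(qid):
    url = "https://www.wikidata.org/wiki/Special:EntityData/%s.json" % qid
    data = requests.get(url).json()
    resolved = next(iter(data["entities"]))
    label = data["entities"][resolved]["labels"].get("en", {}).get("value")
    return resolved, label

print(resolve_qid("Q17524420"))  # returns the surviving id and its English label
```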
I have the same problem when I either want to extract data or use the human-generated type system: `people` crashes with a KeyError. I removed the offending ids one by one and stopped after a while, as I assume it is a deep rabbit hole. Is there an easy way around this? I really want to use this research, but sadly I cannot.
@jcklie I suggest either finding an older Wikidata dump to download (one that no longer faces invalidation issues), or writing your own ruleset/script from scratch. I've had this issue when upgrading to newer Wikidata dumps in the past, and usually there were ~3 broken ids, typically associated with merges of infrequently used Q-ids.
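A third option, if you just want the pipeline to run, is to patch the lookup locally so that missing ids are skipped instead of crashing. A hedged sketch, based on the wkd() body visible in the traceback below; whatever consumes the return value must then tolerate None:

```python
# Sketch of a local patch to extraction/fast_link_fixer.py: make wkd()
# tolerant of Q-ids that were merged or removed between Wikidata dumps.
# Callers must be adapted to skip rules whose ids came back as None.
def wkd(c, name):
    try:
        return c.name2index[name]
    except KeyError:
        print("warning: %s not in this Wikidata dump, skipping" % name)
        return None
```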
@JonathanRaiman Thank you for your quick response. I have another question about it: why does it also crash with "full_preprocess.sh"? I thought that this script automagically collects all the data from scratch while using the newest Wikipedia and Wikidata. |
@jcklie The final step of full_preprocess calls fast_link_fixer.py, which relies on a handful of hard-coded Wikidata ids. For instance, Q17524420 ("aspect of history") is looked up during initialization. Concerning the newest dumps: if one of those ids has since been merged or deprecated, the lookup fails even on freshly collected data, so starting from scratch does not avoid the problem.
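To see how many of those hard-coded ids your particular dump is missing, a quick diagnostic sketch (paths are assumptions; adjust to your checkout and DATA_DIR):

```python
# Sketch: collect every quoted Q-id hard-coded in fast_link_fixer.py
# and report which ones are absent from the dump's wikidata_ids.txt.
import re

with open("extraction/fast_link_fixer.py") as f:
    hardcoded = set(re.findall(r'"(Q\d+)"', f.read()))

with open("data/wikidata/wikidata_ids.txt") as f:
    known = set(line.strip() for line in f)

for qid in sorted(hardcoded - known):
    print("missing from dump:", qid)
```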
I ran the same step, and the end of my output is the same KeyError traceback.
I have the same problem, and I tried looking at the constructed trie. It looks like a lot of category links and anchor tags are missing; most of the entities seem to be absent as well.
Also, it's a stupid question, but I couldn't find where to download the same Wiki dump as mentioned in the paper. Can you point me to it? All the best.
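If you want to inspect the trie yourself, a sketch along these lines should work; the filename is a guess (check what wikidata_linker_utils/wikidata_ids.py actually loads):

```python
# Sketch: open the marisa trie that wikidata_ids.py wraps and check
# whether the offending key is present at all.
import marisa_trie

trie = marisa_trie.BytesTrie()
trie.load("data/wikidata/wikidata_ids.marisa")  # hypothetical filename
print("Q17524420" in trie)        # False here explains the KeyError
print(len(trie.keys("Q1752")))    # how many nearby ids survived
```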
Hi. Can you tell me how to map a Qxxx id to an entity using the Wikidata data we downloaded? I can't find any file that maps these things. Please help. Thanks.
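One way to do this offline, assuming wikidata_ids.txt lists one Q-id per line in index order (which is how the name2index lookup in the traceback appears to work), is to build the mapping yourself; for human-readable labels, the Special:EntityData sketch earlier in the thread applies:

```python
# Sketch: build id <-> index mappings from wikidata_ids.txt, under the
# assumption that line order matches the internal index order.
with open("data/wikidata/wikidata_ids.txt") as f:
    index2qid = [line.strip() for line in f]
qid2index = {qid: i for i, qid in enumerate(index2qid)}

print(qid2index.get("Q309"))  # None if the id is absent from your dump
```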
def load_aucs(): where do these come from?
Could you please tell us which dump file you used in your paper? We failed at this step, so we cannot do the next one. Thank you very much.
I have the same issue with "Q2472587". Did anyone fix it?
I am having the same problem. Did anyone manage to fix it?
Same problem here with Q20871948 |
When running full_preprocess.sh I get a KeyError; the exact error message is:
```
Traceback (most recent call last):
  File "extraction/fast_link_fixer.py", line 594, in <module>
    main()
  File "extraction/fast_link_fixer.py", line 456, in main
    initialize_globals(c)
  File "extraction/fast_link_fixer.py", line 101, in initialize_globals
    ASPECT_OF_HIST = wkd(c, "Q17524420")
  File "extraction/fast_link_fixer.py", line 72, in wkd
    return c.name2index[name]
  File "/usr/local/lib/python3.5/dist-packages/wikidata_linker_utils/wikidata_ids.py", line 20, in __getitem__
    value = self.marisa[key]
  File "src/marisa_trie.pyx", line 577, in marisa_trie.BytesTrie.__getitem__
KeyError: 'Q17524420'
```
More specifically it occurs when running the shell script line:
```
python3 extraction/fast_link_fixer.py ${DATA_DIR}wikidata ${DATA_DIR}${LANGUAGE}_trie ${DATA_DIR}${LANGUAGE}_trie_fixed
```
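A quick way to confirm the diagnosis before digging further (the path is an assumption based on the comments above):

```python
# Sketch: check whether the id the script needs is in the dump at all.
with open("data/wikidata/wikidata_ids.txt") as f:
    print(any(line.strip() == "Q17524420" for line in f))
```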
Would anybody be able to help me with this problem?