
Bound for memory usage #81

Closed · raspooti opened this issue Oct 21, 2014 · 2 comments

@raspooti

First, thanks a lot for the great tool. I've been trying it out, and it seems like magic (except for some corner cases and websites it doesn't work on), but really cool :)

However, I tried it in a setting with scarce resources (1 GB of RAM), and I have the impression that memory keeps growing build after build until ... memory error. I deactivated article memoization, tried emptying the articles and dereferencing the sources, but it looks like a bunch of other things are also memoized and kept in memory, with no way to deactivate them. What is the best way to handle this? How does newspaper handle the increase in memory usage build after build? Is there a limit?

Thanks again for the magic tool :)
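For reference, a rough sketch of the steps described above, assuming newspaper's memoize_articles flag on build() and a placeholder source URL; as noted, this alone did not keep memory bounded:

```python
# Sketch only: disable memoization, empty the article list, drop the reference.
import newspaper

paper = newspaper.build("http://cnn.com", memoize_articles=False)  # placeholder URL
# ... download/parse paper.articles here, persisting results to disk ...
paper.articles = []   # "empty the articles"
paper = None          # "dereference the source"
```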

@raspooti
Author

Hi, it's me again :)
If you wrap your Python script in a shell script and run the scraping loop from the shell, the Python newspaper script is started fresh each time and memory usage stays bounded. But it's not practical; I wish there were a way to do it all in Python.

Or there's something I'm missing :) (Actually, there's a post on Stack Overflow about this very same issue...)
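A minimal sketch of the same idea kept entirely in Python: each build runs in a short-lived child process, so its memory is returned to the OS when the child exits. The site list and the per-article handling are placeholders, not anything confirmed in this thread.

```python
# Sketch only: run each newspaper build in a fresh interpreter so the parent
# process never accumulates memory across builds.
import subprocess
import sys

SITES = ["http://cnn.com", "http://slate.com"]  # placeholder source URLs

WORKER = """
import sys, newspaper
paper = newspaper.build(sys.argv[1], memoize_articles=False)
for article in paper.articles:
    article.download()
    article.parse()
    # persist article.title / article.text here instead of keeping them in memory
"""

for site in SITES:
    # Same effect as the shell-script workaround: all memory used by the build
    # is released when the child process exits.
    subprocess.check_call([sys.executable, "-c", WORKER, site])
```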

@codelucas
Owner

@raspooti there is no way around this at the moment.

I tried it in a setting with scarce resources (1 GB of RAM), and I have the impression that memory keeps growing build after build until ... memory error.

Yeah, things can be improved, but this tool downloads tons of articles along with their related data, so it's bound to consume a lot of memory. That is especially true if you keep everything in Python and keep growing the memory used.

Until a better solution comes up, you can wrap a new shell script for every 1000 articles or so and run them on cron (not in parallel).
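As a rough illustration of that workaround (file names, the batch size, and the cron schedule are placeholders, not part of newspaper's API), each cron run processes one batch of URLs and then exits, so memory is released between batches:

```python
# scrape_batch.py -- sketch only. Run from cron, e.g.:
#   */30 * * * * /usr/bin/python /path/to/scrape_batch.py
# Each invocation handles one batch of ~1000 URLs and then exits,
# releasing all of its memory before the next run.
import newspaper

BATCH_SIZE = 1000

with open("remaining_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

batch, rest = urls[:BATCH_SIZE], urls[BATCH_SIZE:]

with open("articles.tsv", "a") as out:
    for url in batch:
        article = newspaper.Article(url)
        article.download()
        article.parse()
        out.write(article.title.replace("\t", " ") + "\t" + url + "\n")

# Leave the remaining URLs for the next cron run.
with open("remaining_urls.txt", "w") as f:
    f.write("\n".join(rest))
```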
