It extracts every article along with its heading, stores it in a new `.txt` file, and then runs a sentiment analysis on each article, producing the fields below (a sketch of some of these metrics follows the list):
- "url"
- "Positive Sentences"
- "Negative Sentences"
- "Polarity"
- "Subjectivity",
- "Average Sentence Length"
- "Complex Word Percentage",
- "Fog Index"
- "Average WordLength"
- "Complex Word Count"
- "Word Count""Syllable Count"
- "Personal Pronouns"
Dependencies (a sample requirements.txt follows the list):
- BeautifulSoup
- Requests
- Pandas
- os
- nltk
- re
- string
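Since `os`, `re`, and `string` ship with the Python standard library, only the third-party packages belong in `requirements.txt`. A plausible sample (the `openpyxl` entry is an assumption; pandas needs it to read `.xlsx` files):

```
beautifulsoup4
requests
pandas
nltk
openpyxl
```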
Installation:
- Clone this repository to your local machine.
- Install the required dependencies by running `pip install -r requirements.txt` (a note on nltk data follows).
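If the script uses nltk's tokenizers (an assumption; the snippets below only show file handling), the punkt tokenizer data needs a one-time download:

```python
import nltk

nltk.download("punkt")  # one-time download of the sentence/word tokenizer models
```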
Required files and folders (an illustrative layout follows the list):
- stopwords folder
- dict_negative
- dict_positive
- inputfile.xlsx
- an output folder for storing the text file created for every scraped article
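Assuming the file names used in the snippets below, the project layout might look like this (names are illustrative):

```
project/
├── StopWords/           # folder of stop-word lists, one or more files
├── positivewords.txt    # positive dictionary
├── negativewords.txt    # negative dictionary
├── input.xlsx           # URLs to scrape
└── output/              # one .txt file per scraped article
```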
FOLLOW THESE STEPS
- Change the paths of the stopwords folder and the dictionary files according to where your data lives:
```python
import os

stopw = set()  # module-level set that collects every stop word

def initialization():
    # Paths: initialize them according to the location of your data.
    stopword_folder = r"StopWords"            # a folder, not a file
    dictionary_postive = r"positivewords.txt"
    dictionary_negative = r"negativewords.txt"
    for filename in os.listdir(stopword_folder):
        with open(os.path.join(stopword_folder, filename), 'r') as file:
            stopw.update([word.lower() for word in file.read().splitlines()])
```
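The snippet above only loads the stop words. A minimal sketch (the function name and the filtering rule are assumptions) of loading the positive and negative dictionaries into sets the same way, excluding stop words from scoring:

```python
def load_dictionary(path):
    # Keep only words that are not already in the stop-word set.
    with open(path, 'r') as f:
        return {w.lower() for w in f.read().splitlines() if w.lower() not in stopw}

positive_words = load_dictionary(r"positivewords.txt")
negative_words = load_dictionary(r"negativewords.txt")
```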
- Change the path of the input file and provide an Excel file (`input.xlsx`) containing the URLs:
```python
import pandas as pd

def file_open():
    filepath = r"input.xlsx"
    df = pd.read_excel(filepath)  # reading .xlsx requires openpyxl
    dataset = list()              # populated from df in the rest of the script (not shown)
```
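The scraping step itself is not shown in the README; a minimal, hypothetical sketch of saving each article's heading and body to a text file with Requests and BeautifulSoup (the tag selectors and the function name are assumptions, not the script's actual logic):

```python
import requests
from bs4 import BeautifulSoup

def scrape_article(url, out_path):
    # Fetch the page, then pull out the title and the paragraph text.
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else ""
    body = "\n".join(p.get_text(strip=True) for p in soup.find_all("p"))
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(title + "\n" + body)
```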