Quality Starts Stat Missing? #50

Open
dbalders opened this issue Mar 25, 2019 · 4 comments
Comments

@dbalders

Hey, when I call pitching_stats I get tons of data. The only stat I can see that is missing is Quality Starts. Do you know how I can get that stat? I see it on the Baseball Reference page for pitchers, but I'm not sure how to get it via the tool.

Thank you for your time.

@ttaylor14

I am also looking for Quality Starts

@LaSupp

LaSupp commented Jan 2, 2020

I am in the same boat. I couldn't find Quality Starts for a pitcher. When browsing the data on FanGraphs and Baseball Reference I didn't see this metric. I assume another source would have to be scraped in order to get quality starts.

@blacktj

blacktj commented Sep 1, 2020

MLB has a scrape-able option:

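import io

import pandas as pd
import requests
from bs4 import BeautifulSoup

def get_quality_starts():
    """Page through MLB's QS leaderboard until a page includes a pitcher with 0 QS."""
    frames = []
    page = 0
    qs_stop = 1

    while qs_stop != 0:
        url = 'https://www.mlb.com/stats/pitching/quality-starts?expanded=true&page={}'.format(page)
        html = requests.get(url).text  # fetch each page once and reuse it below
        # Column label as pandas parsed it; it includes the sort-caret text.
        qs = pd.read_html(io.StringIO(html))[0]['caret-upcaret-downQS']
        table = BeautifulSoup(html, 'html.parser').find('table')
        list_names = [a['aria-label'] for a in table.find_all('a', 'bui-link')]
        temp_df = pd.DataFrame({'Name': list_names, 'QS': qs})
        if temp_df.empty:  # ran past the last page
            break
        frames.append(temp_df)  # DataFrame.append was removed in pandas 2.0
        qs_stop = temp_df['QS'].min()
        print(page, qs_stop)  # progress check only
        page += 1

    # Build one frame with a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)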

This pages through the MLB stats page, which is pre-sorted by QS, until it hits a page containing a pitcher with 0 QS, then returns a clean DataFrame with name and QS count. It should hold up over a whole season (the current MLB page has 27 pages), but it could get slow with that many pages.
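The stopping rule can also be separated from the scraping so it can be checked without hitting the site. This is only a sketch: `collect_until_zero`, `fetch_page`, and `max_pages` are hypothetical names, and `fetch_page` is assumed to return a list of `(name, qs)` tuples for a given page number.

```python
def collect_until_zero(fetch_page, max_pages=50):
    """Collect rows page by page until a page contains a 0-QS row.

    fetch_page is a hypothetical callable: page number -> list of (name, qs).
    max_pages is a safety cap so a layout change cannot cause an endless loop.
    """
    rows = []
    for page in range(max_pages):
        page_rows = fetch_page(page)
        if not page_rows:  # ran past the last page
            break
        rows.extend(page_rows)
        # The table is sorted by QS, so a 0 on this page means every
        # later page would also be all zeros.
        if min(qs for _, qs in page_rows) == 0:
            break
    return rows


# Offline stand-in for the real scraper, using made-up rows.
fake_pages = [
    [("Pitcher A", 5), ("Pitcher B", 3)],
    [("Pitcher C", 1), ("Pitcher D", 0)],
    [("Pitcher E", 0)],
]

def fake_fetch(page):
    return fake_pages[page] if page < len(fake_pages) else []

result = collect_until_zero(fake_fetch)
```

With the fake pages above, the loop stops after the second page, so the third page is never requested.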

The risk is that we'd have to join this with FanGraphs data, so the names need to match. From a quick look, neither FanGraphs nor MLB uses international characters in their dashboards (I checked Pablo Lopez and Jose Berrios).
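If one source ever does start showing accented names, a small normalization step before the join would guard against mismatches. A minimal sketch using only the standard library; `normalize_name` is a hypothetical helper, not part of any scraper or library mentioned here.

```python
import unicodedata

def normalize_name(name):
    """Return a join key: accents stripped, whitespace collapsed, lowercased."""
    # NFKD splits "é" into "e" plus a combining accent; the accent is then dropped.
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.split()).lower()

# Both spellings map to the same key.
key_accented = normalize_name("Pablo López")
key_plain = normalize_name("Pablo Lopez")
```

The normalized key could be added as a column on both frames and used as the merge key in place of the raw name.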

The print statement is just for checking progress.

@TheCleric
Contributor

TheCleric commented Sep 3, 2020

The playerid_reverse_lookup function can convert between MLB and FanGraphs IDs.

The MLB ID could be extracted from each player's URL in the name column's href.
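A rough sketch of that idea, assuming the player links end in a numeric ID (for example `/player/some-name-123456`; that href shape is an assumption, so check it against the live page). `extract_mlb_id` and `mlb_to_fangraphs` are hypothetical helpers; `playerid_reverse_lookup` is the pybaseball function mentioned above, called with `key_type='mlbam'`.

```python
import re

# Assumed href shape: ".../player/<slug>-<digits>", optionally with a trailing slash.
_ID_AT_END = re.compile(r"-(\d+)/?$")

def extract_mlb_id(href):
    """Return the trailing numeric ID from a player href, or None if absent."""
    match = _ID_AT_END.search(href)
    return int(match.group(1)) if match else None

def mlb_to_fangraphs(mlb_ids):
    """Map MLB IDs to FanGraphs IDs via pybaseball's lookup table."""
    # Imported here so extract_mlb_id stays usable without pybaseball installed.
    from pybaseball import playerid_reverse_lookup
    table = playerid_reverse_lookup(list(mlb_ids), key_type="mlbam")
    return dict(zip(table["key_mlbam"], table["key_fangraphs"]))

# Made-up href for illustration only.
example_id = extract_mlb_id("/player/example-pitcher-123456")
```

Joining on IDs this way would avoid the name-matching problem entirely, at the cost of one extra lookup call.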
