All Russell 3000 symbols p.2 - Using Programming for Fundamental Investing Part 8



Fundamental Investing Playlist: http://youtu.be/fBEMfugH3OA?list=PLQVvvaa0QuDejNczz7dbpyu3JnwUBvNch

This is the eighth video in the series on using programming to aid fundamental investing analysis. It shows how to get all of the ticker symbols from the Russell 3000 into an array/list. The idea is to use programming to help you find eligible companies for further consideration. The series uses value investing as its running example throughout.

Sentdex.com Facebook.com/sentdex Twitter.com/sentdex

Comments

  1. Excellent series of videos! Just found this and subscribed. Unfortunately, I didn't see this video until after I had already gone through the trouble of writing my own Python script to create a dictionary of tickers. It's just as well, since the URL of the constituent list has changed. It's now here: https://www.ftserussell.com/membership-russell-3000

    I basically just save the PDF and then use the PyPDF2 library to do the heavy lifting for me:
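    ```
    import os

    import PyPDF2

    # Path to the saved membership-list PDF
    f = os.path.join('C:\\', 'path', 'to', 'file', 'ru3000_membershiplist_20160627.pdf')

    # PyPDF2 1.x API; newer releases rename these to PdfReader, pages, extract_text
    pdf = PyPDF2.PdfFileReader(f)
    constituents = dict()  # maps company name -> ticker
    # Note: this range stops one short of the last page
    for page_number in range(0, pdf.numPages-1):
        page = pdf.getPage(page_number).extractText()
        lines = page.split('\n')
        # Lines alternate company, ticker; the last six lines of each page are skipped
        for pgidx in range(0, len(lines)-6, 2):
            if lines[pgidx] == 'Company':
                continue  # skip the 'Company' / 'Ticker' header pair
            company = lines[pgidx].strip()
            ticker = lines[pgidx+1].strip()
            constituents[company] = ticker
    ```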

    Hope this helps, if someone still needs the constituent list.

    Cheers,
    Matt
  2. Your videos rock.
  3. Hi,

    What if the list were reversed, with the tickers first and the company names after?
  4. Hi Sentdex, I'm having a little trouble with this tutorial. When checking the array of Russell 3000 symbols against our earlier criteria (PEG ratio, P/B ratio, etc.), some companies have no PEG5 ratio. Take 'ABY' as an example, which shows a PEG5 of N/A on Yahoo Finance.

    This causes my yahoo_key_stats function to go to the 'except' block and return the following message: "failed in the main loop could not convert string to float: N/A".

    I am guessing there is a quick fix or bit of logic that can account for stocks presenting 'N/A' and still output the result for the other tickers in the array, but I am not sure what it is. Your help would be much appreciated :)

    Thanks!
  5. In my case, I have to use ticker = splitLine[-2], which is weird. I got it fixed now, thanks, but I would still like you to address the blank line issue. How come it does not affect Python reading the file? Did Python just skip the blank lines in the text file?
  6. Hello again. I copied your script, copied and pasted the Russell 3000 PDF into the text file, and cleaned it up. However, I did not get the same result as you did. I think the last object in each line is a space instead of the ticker, so when I say ticker = eachline[-1], it prints out a space. There are also some blank lines in the text file. Should I take that into consideration as well?
  7. I've written a simple script based on pdfminer. Probably not the best way to do it, but it works :)

    #! /usr/bin/python
    '''
    There must be a better way of doing this! This is just my 2 cents of testing my new-found
    Python knowledge and wanting to contribute to Sentdex for his wonderful YouTube videos!

    Download and install pdfminer
    Download URL: https://pypi.python.org/pypi/pdfminer/
    Install instructions: http://www.unixuser.org/~euske/python/pdfminer/index.html#install
    '''

    ## Imports
    import os
    import urllib2

    ## Location of Russell3000 pdf
    rusLoc = 'http://www.russell.com/indexes/documents/Membership/Russell3000_Membership_List.pdf'

    ## Start of program
    print "-- Start of program --"

    def downloadFile():
        print 'Trying to download the .pdf from russell.com'

        try:
            getFile = urllib2.urlopen(rusLoc)
            localFile = open('Russell3000_Membership_List.pdf', 'wb')  # binary mode: a PDF is not text
            localFile.write(getFile.read())
            getFile.close()
            localFile.close()
            print 'File downloaded'

        except Exception, e:
            print 'File did not download. Reason:', str(e)

    def convertFile():
        print 'Trying to convert the file from .pdf to .txt'

        try:
            os.system('pdf2txt.py -o Russell3000_Membership_List.txt Russell3000_Membership_List.pdf')
            os.system('rm Russell3000_Membership_List.pdf')
            print 'File Converted'

        except Exception, e:
            print 'File could not be converted. Reason:', str(e)

    def processFile():
        print 'trying to process the file for just the ticker information'

        try:
            try:
                with open('Russell_list.txt'):
                    print 'Found existing file. Deleting it'
                    os.system('rm Russell_list.txt')

            except IOError:
                print 'No previous file found.'

            print 'Creating new file'
            display = 0
            localFile = open('Russell3000_Membership_List.txt', 'r')
            for line in localFile:
                if (line.strip()[:6] == 'As of ') or (line.strip() == 'Company'):
                    display = 0

                if (len(line) > 2) and (display == 1):
                    lineContents = line.strip() + ', '
                    #print lineContents
                    writeFile = open('Russell_list.txt','a')
                    writeToFile = writeFile.write(lineContents)
                    writeFile.close()

                if line.strip() == 'Ticker':
                    display = 1

            localFile.close()

        except Exception, e:
            print 'File could not be processed. Reason:', str(e)

    downloadFile()
    convertFile()
    processFile()

    ## End of program
    print "-- End of program --"
  8. FYI: I found pdfquery, which is based on pdfminer. The table is well formatted, so pdfquery should work.
  9. That's a good question. I have always needed it only as a one-time thing, but using it to keep an updated list of the Russell 3000 seems useful. I will look into it and maybe put out a video.
  10. How would you work with the PDF directly, downloading and extracting it via Python? Amazing topic! Thanks for this playlist.
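
On the question in comment 10: the download half can be done with the standard library alone. The following is a minimal sketch, assuming Python 3; the helper name `download_pdf` is hypothetical and not from the video. Text extraction still needs a PDF library such as the ones used in comments 1 and 7. The output file is opened in binary mode because a PDF is not plain text.

```python
import shutil
import urllib.request


def download_pdf(url, dest_path):
    """Stream the file at `url` to `dest_path` and return `dest_path`.

    `download_pdf` is a hypothetical helper name used for illustration.
    """
    # Open the output in binary mode ("wb") so the PDF bytes are written unchanged.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

Call it with whichever membership-list PDF URL is current, then hand the saved path to the PDF library of your choice.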

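On the trailing-space and blank-line questions in comments 5 and 6 (and the reversed-order question in comment 3): Python does not skip blank lines when reading a file; each one arrives as `'\n'`. If a line is split on a single space, `'MMM \n'.split(' ')` returns `['MMM', '\n']`, so index `[-1]` is the newline and the ticker sits at `[-2]`. That is one likely explanation for the behaviour described, assuming the script splits this way. A sketch that sidesteps both problems, with a hypothetical function name and made-up sample lines:

```python
def tickers_from_lines(lines, ticker_first=False):
    """Return the ticker from each non-blank line.

    Assumes whitespace-separated fields with the ticker as the last
    field, or as the first field when `ticker_first` is true.
    """
    tickers = []
    for line in lines:
        # split() with no arguments drops leading/trailing whitespace and newlines
        fields = line.split()
        if not fields:  # blank or whitespace-only line
            continue
        tickers.append(fields[0] if ticker_first else fields[-1])
    return tickers


# Illustrative lines, not taken from the actual membership list file
sample = ["3M CO MMM \n", "\n", "   \n", "ABBOTT LABORATORIES ABT\n"]
print(tickers_from_lines(sample))
```

Passing an open file object works as well, since iterating over a file yields its lines.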

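On the 'N/A' failure described in comment 4: one common approach is to convert each value defensively and skip only the ticker whose value does not parse, so the rest of the array is still processed. This is a minimal sketch under stated assumptions: the function names, the dictionary of already-scraped values, and the 2.0 threshold are all hypothetical, and the `yahoo_key_stats` function from the series is not reproduced here.

```python
def to_float(text):
    """Return `text` as a float, or None when it is not numeric (e.g. 'N/A')."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def screen(peg_by_ticker, max_peg=2.0):
    """Return the tickers whose PEG value parses and is below `max_peg`.

    `peg_by_ticker` maps ticker -> PEG string as scraped; `max_peg` is an
    illustrative threshold, not a value from the series.
    """
    passed = []
    for ticker, peg_text in peg_by_ticker.items():
        peg = to_float(peg_text)
        if peg is None:  # 'N/A' or otherwise unparseable: skip this ticker only
            continue
        if peg < max_peg:
            passed.append(ticker)
    return passed
```

An equivalent fix is to keep the existing try/except but place it inside the per-ticker loop, so one bad value does not end the whole run.
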
Additional Information:

Visibility: 3407

Duration: 8m 49s

Rating: 18