Scikit Learn Machine Learning Tutorial for investing with Python p. 6



In this machine learning with Python and scikit-learn tutorial video, we cover how to use the Pandas module to help structure and modify our data.

No Pandas? No problem.
Windows users: http://www.lfd.uci.edu/~gohlke/pythonlibs/#pandas
Others: http://pandas.pydata.org/

Sample code: http://pythonprogramming.net

http://seaofbtc.com
http://sentdex.com
http://hkinsley.com
https://twitter.com/sentdex

Bitcoin donations: 1GV7srgR4NJx4vrk7avCmmVQQrqmv87ty6

Comments

  1. Oh man, you make it look easy. Great job.
  2. Good videos, keep up the good work.
  3. +sentdex Thanks for the great tuts!! Keep it up...
  4. I had no programming language experience and started watching your videos after taking a Python class on Codecademy. These are really great videos and I learned a lot from them, thanks very much.

    May I ask a basic question? When I forget to type "df = " in this line of code:

    "df = df.append({'Date':date_stamp,'Unix':unix_time,'Ticker':ticker,'DE Ratio':value,}, ignore_index = True)"

    nothing is written to the .csv file that gets produced. Why?
  5. Don't normally leave comments, but these videos are fantastic! Well done, keep up the amazing work!
  6. Thanks for the video. One question: at the end I run the file, so where exactly is my saved csv file?
    I cannot locate it for the love of ..... Thanks.
  7. How do I scrape old data from Yahoo Finance?
    At what URL can I find said data?
  8. thanks sentdex
  9. Some of the source files have a newline between the closing '</td>' and '<td class="yfnc_tabledata1">'.

    Other files are missing 'Total Debt/Equity (mrq)' altogether.

    Here's how I resolved it:
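    ```
    # Split in two steps so a newline between the two cells does not matter,
    # and fall back to 'N/A' when the statistic is missing from the file.
    if gather + ':</td>' in source:
        value = source.split(gather + ':</td>')[1].split('<td class="yfnc_tabledata1">')[1].split('</td>')[0]
    else:
        value = 'N/A'
    print(ticker + ':' + value)
    ```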

    Awesome tutorial thus far Harrison!
  10. What do we need the Unix time for in our script?
  11. I'm using a Mac with Anaconda and get this error:

    ticker = each_dir.split("\\")[1]
    IndexError: list index out of range
  12. Just a thought, but maybe try SortedContainers for your data appends. It'll be faster than Pandas because it won't do the DataFrame copy (which Pandas does on every append). You then convert to a DataFrame once at the end, which is (sadly) still a copy.
  13. WOW! 40 seconds! That took me 5 minutes in 30% of my FX-6100, I bet you're using an i7, those things are productivity MONSTERS!

    And by the way thanks for the tutorials, great content and simplicity. YouTube needs more people like you.
  14. Hi Sentdex

    Is your code for this available to download? I have typed it in and get

        value = source.split(gather + ':</td><td class="yfnc_tabledata1">')[1].split('</td>')[0]
    IndexError: list index out of range

    I know it's probably a typo, but I am going cross-eyed trying to find it.
  15. I've been looking for machine learning tutorials for Python lately, thanks =D
  16. If I look in my CSV file, the tickers start at rok, and other companies like appl aren't even there. Any ideas why this might be the case? (I'm on Ubuntu, so that might be the problem.)
  17. Hi. Could you explain what the 'gather' keyword is? Is it similar to args?
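
A note on comment 4 (and the copy cost raised in comment 12): `DataFrame.append` returns a new DataFrame rather than changing `df` in place, so without `df = ` the result is thrown away and `df` stays as it was. `DataFrame.append` was also removed in pandas 2.0. The sketch below is only an illustration, not the code from the video: it uses `pd.concat`, which has the same return-a-new-object behavior, and the row values are made up.

```python
import pandas as pd

columns = ["Date", "Unix", "Ticker", "DE Ratio"]

# Hypothetical rows; the values are invented for illustration only.
first_row = {"Date": "2004-01-30", "Unix": 1075420800.0, "Ticker": "a", "DE Ratio": 0.5}
second_row = {"Date": "2004-04-13", "Unix": 1081814400.0, "Ticker": "a", "DE Ratio": 0.6}

df = pd.DataFrame([first_row], columns=columns)

# The result is discarded here, so df still holds only the first row.
pd.concat([df, pd.DataFrame([second_row])], ignore_index=True)
rows_after_discarded_concat = len(df)

# Assigning the result back is what actually grows df.
df = pd.concat([df, pd.DataFrame([second_row])], ignore_index=True)
rows_after_assignment = len(df)

# Alternative: collect plain dicts in a list and build the frame once,
# which avoids copying the whole DataFrame on every added row.
rows = [first_row, second_row]
df_from_list = pd.DataFrame(rows, columns=columns)
```

If the script writes `df` to disk at the end, a `df` that was never reassigned inside the loop would still contain only what it started with, which would explain an empty file.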
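
A note on the `IndexError` in comment 11: splitting on `"\\"` assumes Windows-style paths. On a Mac or Linux machine the directory strings use `/`, so the split returns a single element and index `[1]` does not exist. A minimal sketch of a separator-independent alternative follows. The directory names and the helper `ticker_from_dir` are hypothetical, chosen only for illustration.

```python
import ntpath
import os
import posixpath

# Hypothetical directory strings, as they might look on each platform.
windows_style = "stats_dir\\aapl"
posix_style = "stats_dir/aapl"

# Splitting a POSIX-style path on a backslash yields one element,
# so indexing [1] on this list would raise IndexError.
parts = posix_style.split("\\")

# ntpath and posixpath are the Windows and POSIX flavours of os.path;
# they are called directly here so the example behaves the same everywhere.
ticker_win = ntpath.basename(windows_style)
ticker_posix = posixpath.basename(posix_style)


def ticker_from_dir(each_dir):
    """Return the last path component using the current platform's separator."""
    return os.path.basename(os.path.normpath(each_dir))
```

In a script, `ticker_from_dir(each_dir)` could stand in for `each_dir.split("\\")[1]`, assuming the ticker is the last folder in the path.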


Additional Information:

Visibility: 13933

Duration: 8m 59s

Rating: 72