Scikit Learn Machine Learning for investing Tutorial with Python p. 4



In this tutorial, we cover the concepts needed to handle our data set for machine learning. We then cover some basic code for pulling specific data points out of the files.

Data download: http://pythonprogramming.net/static/downloads/machine-learning-data/intraQuarter.zip
Sample code: http://pythonprogramming.net
http://seaofbtc.com
http://sentdex.com
http://hkinsley.com
https://twitter.com/sentdex
Bitcoin donations: 1GV7srgR4NJx4vrk7avCmmVQQrqmv87ty6
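
As a rough illustration of what handling this data set involves, here is a minimal sketch that walks the `_KeyStats` folder and turns each snapshot file name into a `datetime`. It is pieced together from the snippets quoted in the comments and is not necessarily the code from the video. The function name `list_snapshots` is hypothetical, and the file-name pattern `%Y%m%d%H%M%S.html` is an assumption based on the date stamps viewers report seeing.

```python
import os
from datetime import datetime


def list_snapshots(path):
    """Return (ticker, datetime) pairs for every snapshot under path/_KeyStats."""
    stats_path = os.path.join(path, "_KeyStats")
    # os.walk yields (dirpath, dirnames, filenames) tuples; x[0] is the directory path.
    # The first entry is stats_path itself, so it is skipped.
    stock_dirs = [x[0] for x in os.walk(stats_path)][1:]
    snapshots = []
    for stock_dir in sorted(stock_dirs):
        for file_name in sorted(os.listdir(stock_dir)):
            if not file_name.endswith(".html"):
                continue
            # ASSUMPTION: file names look like 20080401191336.html
            stamp = datetime.strptime(file_name, "%Y%m%d%H%M%S.html")
            snapshots.append((os.path.basename(stock_dir), stamp))
    return snapshots
```

Calling `list_snapshots` with the folder where the zip was extracted and looping over the result with `print(ticker, stamp)` should list one line per saved file. If the folder does not exist, the function returns an empty list rather than raising an error.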

Comments

  1. Hey, I've got a problem with os.walk: it doesn't work and returns an empty list.
    This is my code; I'm running it on Ubuntu 14.04:
    path = "home/asma/machinelearning/intraQuarter"
    statspath = path + '_KeyStats'
    stock_list = ([x[0] for x in os.walk(statspath)])
  2. Hi guys, I'm a new learner here, and I'm using PyCharm for this tutorial. Somehow I can't see any results when I just hit the Run button; the only output is 'Process finished with exit code 0'. If I run it in the console, I have to copy and paste line by line.

    Please help, anyone. Thank you.
  3. 16:15 I think these are called 'format specifiers'
  4. Hi, I'm running the program above the way described in your tutorial, and nothing ends up running. I think the script isn't able to pull the data from the file. I use a Mac and my path designation is: path = "/Users/princeghosh/Desktop/intraQuarter". My file is saved on my desktop. Is there anything you would recommend?
  5. I have my own data. How do I import it into the "jupyter" Anaconda notebook so I can work with it?

    -CHRIS
  6. I am planning to perform the analysis on Indian Stock Markets. How do we download the historical data for Indian Stock Markets?
  7. Sir! I want to make an auto-summarization app. I tried to collect summarization data from the internet, but I could not find any. Please tell me a website where I can get the training data.
  8. Can this be done in py 3.6?
  9. Hey, could you show us, in another video perhaps, how you parsed all these HTML files? Maybe you told us and I didn't hear, but it'd be super helpful. I'd really like to have the latest versions of all these HTML files and understand how you gathered the data like this.

    I'm gonna assume it was something like looping through the S&P index and just writing the whole HTML file of each stock to a folder.
  10. What are the use cases for using your zipline series vs. following this series?
  11. I love these videos. Thank you for all of them. A few questions, if you can help:
    1. Adjusted Close is not available for Indian indices or many other indices. How do we get it? The only source is Yahoo Finance, which is not beyond doubt. SBIN (State Bank of India) is one example where I found anomalies in the Adjusted Close for 2015.
    2. The keystats folder is inconsistent. For a few stocks there are more than 3 entries per year, and for a few there are only one or two. Why don't we have the same number of files for each stock?
    3. How often do the stock fundamentals change in a year?
    4. How would the available features (stock fundamentals) help improve the accuracy of the predictions?
  12. Can you teach us how to spider data from Yahoo? Thank you so much! I learn a lot from your videos!
  13. Hi Harrison! I want to work through all of your https://pythonprogramming.net/data-analysis-tutorials/ over the summer. I don't know whether there is a specific sequence to follow for the videos, or if you have any suggestions. Thank you again for all your kindness.
  14. Very detailed and comprehensible! Great job!!
    Thank you for making these tutorials!
  15. Hi, why doesn't my Python have the timetuple function? Thanks.
  16. These tutorials are extremely insightful, thank you for continuing to create this high quality content sentdex!
  17. Hey Sentdex,

    I've really been enjoying this series, and mostly I've had no problems following along. I'm getting some really strange output, though, from the last few lines of this code.

    "print(date_stamp)" results in "2008-04-01 19:13:36" and so on.
    "print(unix_time)" results in "1229274816.0" and so on.

    Put those two together, and I ought to have the same format as your output, right?

    But "print(date_stamp, unix_time)" results in "(datetime.datetime(2008, 12, 14, 19, 12, 36), 1229274816.0)"

    Any idea why this is happening?
  18. I love you.
  19. I really like your videos and your insight on machine learning and on using it in markets. However, in the context of financial markets, I strongly think you are looking at the data the wrong way, or rather missing important data sets. Also, have you ever gotten around to finding an API or service that provides calculated EMA or MACD values, historical as well as live? I think calculating those manually is tedious, and it becomes heavy work when larger data sets are needed for certain technical indicator calculations.
  20. Hi there. For some reason, when I save and run the first part of the Key_Stats function, it prints an empty list. I have been checking the code and found that the stock_list variable is not working for me. I haven't been able to find the mistake so far. Could anybody help me find it, please? My code follows below.

    import pandas as pd
    import os
    import time
    from datetime import datetime

    path = "C:\Backup\intraQuarter"

    def Key_Stats(gather="Total Debt/Equity (mrq)"):
        statspath = path+'/_KeyStats'
        stock_list = [x[0] for x in os.walk(statspath)]
        print(stock_list)

    Key_Stats()
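
Several comments above report an empty stock_list. One possible cause is that the path handed to os.walk does not exist. This is only an assumption, since the commenters' file systems are not visible here, but by default os.walk ignores errors and simply yields nothing for a missing directory. The first comment's code, for example, builds "home/asma/..." without a leading slash and appends '_KeyStats' without a separator. A minimal sketch of a check that fails loudly instead, using the hypothetical helper name `stock_dirs`:

```python
import os


def stock_dirs(base_path, subfolder="_KeyStats"):
    """List the per-stock directories, raising if the folder is missing."""
    # os.path.join inserts the separator that plain string concatenation
    # (path + '_KeyStats') would leave out.
    stats_path = os.path.join(base_path, subfolder)
    if not os.path.isdir(stats_path):
        # os.walk would silently yield nothing here, so raise instead.
        raise FileNotFoundError("No such directory: " + os.path.abspath(stats_path))
    # Skip the first entry, which is stats_path itself.
    return [x[0] for x in os.walk(stats_path)][1:]
```

If this raises, the absolute path in the error message shows where Python actually looked, which is usually enough to spot a missing slash or a wrong extraction folder.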


Additional Information:

Visibility: 24622

Duration: 18m 19s

Rating: 128