Today we will build a website that analyzes the stock price of Alphabet Inc. (Google's parent company), using Python and a few supporting libraries:
- pandas: an open source library that provides easy-to-use, high-performance data structures and data analysis tools for the Python programming language.
- pandas-datareader: instead of opening a browser and downloading files from a data provider's website, this library downloads stock market data straight into Python as a DataFrame.
- bokeh: a data visualization library; instead of dry numbers, bokeh lets us present the data as charts.
The prerequisite for this article is, of course, a basic knowledge of Python and a little familiarity with Jupyter Notebook, a web-based application for running Python interactively. Otherwise you can skim it as a reference, for example if you are simply curious and asking, “What kind of fun things can Python do?”. OK, let’s move on!
1. Install pandas_datareader to get data
As mentioned above, the usual way to find data is to open a web browser, search Google for a keyword, go to the website that provides the data, and download it. With the pandas_datareader library and Python, you can download the data directly without using any browser. Previously, pandas_datareader was just a module inside the pandas library, but it is now a standalone library.
To install the library on Windows, you just need to run the following command (I have already installed it here):
pip install pandas_datareader
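To confirm that the install worked, a small check like the one below can help. It is only a sketch: the helper name is_installed is my own, and it relies solely on the standard library, so it runs even when the packages are missing.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python's import system can locate the module."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of the three libraries used in this article
for name in ("pandas", "pandas_datareader", "bokeh"):
    print(name, "installed" if is_installed(name) else "missing")
```

If any of them is reported as missing, run pip install for that package and check again.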
2. Create the code file in Jupyter Notebook
Get data from the data source
For those who have never heard of it, Jupyter Notebook is a web-based application that lets you run Python interactively, similar to an IDE. I created a new file and named it stock_analysis.py
Next, we select a stock to analyze. I like Google, so we will analyze Google; the corresponding stock symbol for this company is GOOG. You can choose any other company, as long as it is listed on a stock exchange.
First we need to import the two necessary libraries, pandas_datareader and datetime, and then fetch the data. DataReader accepts four parameters, in this order:
- name: the stock symbol of the company you want to fetch
- data_source: the data source; in addition to yahoo, there are other financial data providers such as google, fred, quandl, worldbank, …
- start: start date
- end: end date
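The call described above might look like the sketch below. The wrapper fetch_prices is a hypothetical helper of mine, not part of the library, and the dates are the ones used in this article. Whether the yahoo source still responds depends on the installed pandas_datareader version and on the provider, so treat the download line as an assumption.

```python
import datetime


def fetch_prices(symbol, start, end, source="yahoo"):
    """Download daily prices for one symbol as a DataFrame (needs network access)."""
    # Imported here so the rest of the snippet works without the package installed
    from pandas_datareader import data

    return data.DataReader(name=symbol, data_source=source, start=start, end=end)


start = datetime.datetime(2019, 12, 1)
end = datetime.datetime(2019, 12, 15)

# Uncomment to actually download the data:
# df = fetch_prices("GOOG", start, end)
# print(df.head())
```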
The data returned is a two-dimensional DataFrame table. If you are not familiar with the indicators in the table, do not worry; I will introduce them in the next section.
OK, we now have a table full of numbers. Let's go through what its rows, columns, and numbers mean.
- If you look closely, you will notice that the table above is missing 4 days (07 & 08/12/2019 and 14 & 15/12/2019). The reason is simple: the stock market is normally open only on weekdays, Monday to Friday, so there is no data for weekends.
- High column: the highest price of the day
- Low column: lowest price of the day
- Open column: the price at the market open
- Close column: the price at the market close
- Volume: the number of shares traded in a day
In this article I only introduce the columns briefly; we will use only the High, Low, Open, and Close columns.
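The weekend gaps mentioned in the list above are easy to verify with the standard library. This is a small sketch: weekend_days is a helper name I made up, and it only accounts for Saturdays and Sundays, not for market holidays.

```python
import datetime


def weekend_days(start, end):
    """Return the Saturdays and Sundays between start and end (inclusive)."""
    total = (end - start).days + 1
    all_days = [start + datetime.timedelta(days=i) for i in range(total)]
    # weekday() returns 5 for Saturday and 6 for Sunday
    return [d for d in all_days if d.weekday() >= 5]


gaps = weekend_days(datetime.date(2019, 12, 2), datetime.date(2019, 12, 15))
print([d.isoformat() for d in gaps])
# ['2019-12-07', '2019-12-08', '2019-12-14', '2019-12-15']
```

These are exactly the four dates missing from the table.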
Visualize data using candlestick chart
Okay, first let's look at the result we are aiming for, to answer the question: what does the output look like?
If you are not familiar with candlestick charts, you can read more here :
I created a new function named inc_dec() whose inputs are the close and open prices of each day and whose output is the result of comparing the two values: Increase, Decrease, or Equal. Then we add a Status column to the table, filled with the status returned by inc_dec() for each row:
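def inc_dec(c, o):
    # Compare the closing price (c) with the opening price (o)
    if c > o:
        return "Increase"
    elif c < o:
        return "Decrease"
    return "Equal"
df["Status"] = [inc_dec(c, o) for c, o in zip(df.Close, df.Open)]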
At this time, a new column with the column name “Status” is added to the dataframe:
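To see the Status column without downloading anything, here is a self-contained sketch on a tiny table. The three rows of prices are made up for illustration only; they are not real GOOG data.

```python
import pandas as pd


def inc_dec(c, o):
    """Compare a day's closing price with its opening price."""
    if c > o:
        return "Increase"
    elif c < o:
        return "Decrease"
    return "Equal"


# Hypothetical prices for three trading days (illustration only)
df = pd.DataFrame(
    {"Open": [100.0, 105.0, 103.0], "Close": [104.0, 102.0, 103.0]},
    index=pd.to_datetime(["2019-12-02", "2019-12-03", "2019-12-04"]),
)
df["Status"] = [inc_dec(c, o) for c, o in zip(df.Close, df.Open)]
print(df)
```

The three rows get the statuses Increase, Decrease, and Equal, in that order.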
We will start building the coordinate system for the chart using the figure function:
from bokeh.plotting import figure
p = figure(x_axis_type='datetime', width=1000, height=300)
p.title.text = "Candlestick Chart"
Next we build the rectangles (the bodies of the candles) using the rect() function. The x coordinate of each rectangle is the date in question on the Ox axis. The top and bottom edges of the rectangle are given by the open and close prices, which means that the center of the rectangle on the Oy axis equals the average of Open and Close. I added a column named Middle to hold that average and a column named Height to hold the height of the rectangle.
df["Middle"] = (df.Open+df.Close)/2
df["Height"] = abs(df.Close-df.Open)
By convention, days when the stock goes up (close > open) are drawn in blue, while days when it goes down are usually drawn in red (the "bloody" market), so we split the drawing of the rectangles for increase and decrease days into two separate statements as follows:
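The two rect statements could look like the sketch below. It is my own reconstruction under stated assumptions: the prices are made up, and the candle width of 12 hours (in milliseconds, because the x axis is a datetime axis) is a value I picked, not one given in this article.

```python
import pandas as pd
from bokeh.plotting import figure

# Hypothetical prices for three trading days (illustration only)
df = pd.DataFrame(
    {"Open": [100.0, 105.0, 103.0], "Close": [104.0, 102.0, 103.0]},
    index=pd.to_datetime(["2019-12-02", "2019-12-03", "2019-12-04"]),
)
df["Status"] = [
    "Increase" if c > o else "Decrease" if c < o else "Equal"
    for c, o in zip(df.Close, df.Open)
]
df["Middle"] = (df.Open + df.Close) / 2
df["Height"] = abs(df.Close - df.Open)

p = figure(x_axis_type="datetime", width=1000, height=300)

hours_12 = 12 * 60 * 60 * 1000  # assumed candle width: 12 hours in milliseconds

inc = df[df.Status == "Increase"]
dec = df[df.Status == "Decrease"]

# One statement per direction so each group gets its own fill color
r_inc = p.rect(inc.index, inc.Middle, hours_12, inc.Height,
               fill_color="blue", line_color="black")
r_dec = p.rect(dec.index, dec.Middle, hours_12, dec.Height,
               fill_color="red", line_color="black")
```

Days with the status Equal have a height of zero, so this sketch simply does not draw a body for them.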
Merge all the code and run:
Yeah, we are almost halfway there. Next we will draw the lines that show the High and Low prices. This is simpler than drawing rectangles: we just need to use the segment method to draw straight lines. See more about segments here
p.segment(df.index, df.High, df.index, df.Low, color="Black")
Run the code again to see what’s new:
OK, this looks pretty good but is still not quite right: because segment() is called after rect(), the lines are drawn over the rectangles. To fix this, we just need to move the segment statement before the statements that draw the rectangles (rect). Revise the code and check the result. Wow, that looks pretty good.
To get a better overview and see the bigger picture, we need a longer time range. I will change start_date from 2019/12/01 to 2019/06/15, so that we see the overview from 2019/06/15 to 2019/12/15.
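Putting the drawing order fix described above into one place, the whole chart could be assembled roughly like this. It is a sketch, not the exact code from my notebook: build_candlestick and HALF_DAY_MS are names I introduce here, the candle width is an assumed value, and the sample prices are made up.

```python
import pandas as pd
from bokeh.plotting import figure

HALF_DAY_MS = 12 * 60 * 60 * 1000  # assumed candle width: 12 hours in milliseconds


def build_candlestick(df):
    """Return a figure with the High-Low lines drawn first and the bodies on top."""
    data = df.copy()
    data["Middle"] = (data.Open + data.Close) / 2
    data["Height"] = abs(data.Close - data.Open)
    up = data[data.Close > data.Open]
    down = data[data.Close < data.Open]

    p = figure(x_axis_type="datetime", width=1000, height=300)
    p.title.text = "Candlestick Chart"
    # Lines first: glyphs added later are painted over earlier ones
    p.segment(data.index, data.High, data.index, data.Low, color="black")
    p.rect(up.index, up.Middle, HALF_DAY_MS, up.Height,
           fill_color="blue", line_color="black")
    p.rect(down.index, down.Middle, HALF_DAY_MS, down.Height,
           fill_color="red", line_color="black")
    return p


# Hypothetical prices for three trading days (illustration only)
toy = pd.DataFrame(
    {
        "Open": [100.0, 105.0, 103.0],
        "High": [106.0, 107.0, 104.0],
        "Low": [99.0, 101.0, 102.0],
        "Close": [104.0, 102.0, 103.0],
    },
    index=pd.to_datetime(["2019-12-02", "2019-12-03", "2019-12-04"]),
)
chart = build_candlestick(toy)
# To open the chart in a browser: from bokeh.plotting import show; show(chart)
```

With real data, pass the DataFrame fetched for the longer date range instead of the toy table.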
Yay, that looks nice, even impressive. So in this part 1, I created a simple stock chart. This is just a small demonstration of the immense power of Python and its countless third-party libraries. In Part 2, I will show you how to combine the results of part one with a little knowledge of the Flask framework to create a simple website and deploy it to Heroku.
Many thanks. Have fun!