Introduction
This article is for readers who are new to Python syntax. It shows how to use Python's requests library to download the images of an Instagram account.
Preparation
Install the requests library: pip install requests
Analysis
First, open the profile page of the user you want to crawl, press F12 and switch to the Network tab, then scroll through the images. The API request that fetches the Instagram photos is easy to find:
https://www.instagram.com/graphql/query/?query_hash=003056d32c2554def87228bc3fd9668a&variables=%7B%22id%22%3A%224499737748%22%2C%22first%22%3A12%2C%22after%22%3A%22QVFEU0wtaE15VUNGLUd5dXNKR0FHbWx2UmlKS0ZlcDZBVXpFNkdTeXhycFN4SHVhVWJwZzNsTld0cU1xS1RLa1huT2w0X0dnS0tLWnVfUVlsNU5JOTJKRw%3D%3D%22%7D
Decoded, the API URL has the form:
https://www.instagram.com/graphql/query/?query_hash=003056d32c2554def87228bc3fd9668a&variables={"id":"4499737748","first":12,"after":"QVFEX0l4TElsblNiSklTSDJaXzZsLUE3ajlvTE44UktYR2lPNm1SOWtRWmR2d21VZWJNUEJKdHVXU3hIOGNDS2FKQWNhdVBaZk5wZGpmMGRkTG1rZTV6Tg=="}
first: the number of images to return, starting from the position given by after. With after = "" we get the first 12 images; for every later request, after is the end_cursor returned by the request before it.
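To make the relationship between the two URL forms concrete, here is a minimal sketch that builds the encoded URL from the variables and decodes it back. The helper names build_query_url and read_variables are made up for illustration; the endpoint and query_hash are the ones shown above, and nothing here contacts Instagram.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs

# Endpoint and query_hash as they appear in the article.
GRAPHQL_URL = "https://www.instagram.com/graphql/query/"
QUERY_HASH = "003056d32c2554def87228bc3fd9668a"


def build_query_url(user_id, after="", first=12):
    """Return the API URL for one page of posts (hypothetical helper)."""
    variables = {"id": user_id, "first": first, "after": after}
    # Compact separators keep the JSON free of extra spaces.
    params = {
        "query_hash": QUERY_HASH,
        "variables": json.dumps(variables, separators=(",", ":")),
    }
    # urlencode percent-encodes the JSON ({ becomes %7B, " becomes %22).
    return GRAPHQL_URL + "?" + urlencode(params)


def read_variables(url):
    """Decode the variables parameter of an API URL back into a dict."""
    query = parse_qs(urlparse(url).query)
    return json.loads(query["variables"][0])


url = build_query_url("4499737748")
print(url)
print(read_variables(url))
```

Building the URL this way avoids quoting mistakes when the cursor contains characters such as "=".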
So the crawl works like this: send the first API request -> collect the image URLs and end_cursor, and check whether there is a next page -> if there is, send the API request again with the end_cursor from the previous call.
Image crawl process: take the result of the API request -> parse it as JSON -> check whether each post is a photo or a video -> check whether the post contains other images -> collect the image URLs
    nextLink = 'https://www.instagram.com/graphql/query/?query_hash=003056d32c2554def87228bc3fd9668a&variables={"id":"' + id + '","first":12,"after":"' + end + '"}'
    res = r.get(nextLink).json()
    edges = res['data']['user']['edge_owner_to_timeline_media']['edges']
    for e in edges:
        is_video = e['node']['is_video']
        if (is_video is False):
            link.append(e['node']['display_url'])
        if "edge_sidecar_to_children" in e['node']:
            ne = e['node']['edge_sidecar_to_children']['edges']
            for nee in ne:
                is_video = nee['node']['is_video']
                if (is_video is False):
                    link.append(nee['node']['display_url'])
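The extraction step can be tried without any network access. The sketch below runs it on made-up sample data shaped like the fields used above (is_video, display_url, edge_sidecar_to_children); image_urls is a hypothetical helper. Unlike the snippet above, it reads only the children of a multi-image post, on the assumption that the post's own display_url repeats one of them.

```python
def image_urls(node):
    """Return the photo URLs of one post node, skipping videos."""
    children = node.get("edge_sidecar_to_children")
    if children:
        # Multi-image post: take each child that is not a video.
        return [
            child["node"]["display_url"]
            for child in children["edges"]
            if not child["node"]["is_video"]
        ]
    # Single post: keep it only if it is a photo.
    return [] if node["is_video"] else [node["display_url"]]


# Made-up sample data that mimics the shape of the API response.
sample_edges = [
    {"node": {"is_video": False, "display_url": "https://example.com/a.jpg"}},
    {"node": {"is_video": True, "display_url": "https://example.com/b.jpg"}},
    {"node": {
        "is_video": False,
        "display_url": "https://example.com/c1.jpg",
        "edge_sidecar_to_children": {"edges": [
            {"node": {"is_video": False, "display_url": "https://example.com/c1.jpg"}},
            {"node": {"is_video": True, "display_url": "https://example.com/c2.jpg"}},
            {"node": {"is_video": False, "display_url": "https://example.com/c3.jpg"}},
        ]},
    }},
]

links = []
for edge in sample_edges:
    links.extend(image_urls(edge["node"]))
print(links)
```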
Check if there is a next page and get end_cursor:
    end = res['data']['user']['edge_owner_to_timeline_media']['page_info']['end_cursor']
    check = res['data']['user']['edge_owner_to_timeline_media']['page_info']['has_next_page']
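Put together, these two values drive the paging loop. The following is a rough sketch of that loop, tested against fake responses instead of the real API. crawl_pages, fetch_page, make_response and the cursors "c1" and "c2" are all invented for the example; in a real run, fetch_page would wrap the requests call shown earlier.

```python
def crawl_pages(fetch_page):
    """Yield every response, following end_cursor until the last page.

    fetch_page is a hypothetical callable: it takes the "after" cursor
    and returns the parsed JSON of one API response.
    """
    after = ""  # an empty cursor asks for the first page
    while True:
        res = fetch_page(after)
        yield res
        page_info = res["data"]["user"]["edge_owner_to_timeline_media"]["page_info"]
        if not page_info["has_next_page"]:
            break
        after = page_info["end_cursor"]


def make_response(cursor, has_next):
    # Builds a fake response that has only the fields the loop reads.
    page_info = {"end_cursor": cursor, "has_next_page": has_next}
    return {"data": {"user": {"edge_owner_to_timeline_media": {
        "edges": [], "page_info": page_info}}}}


# Fake three-page account, keyed by the cursor that requests each page.
FAKE_PAGES = {
    "": make_response("c1", True),
    "c1": make_response("c2", True),
    "c2": make_response(None, False),
}

requested = []


def fake_fetch(after):
    requested.append(after)
    return FAKE_PAGES[after]


pages = list(crawl_pages(fake_fetch))
print(requested)
```

Starting from the empty cursor means the first page is handled by the same code path as every later page.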
Finally, create a new folder and download the photos:
    current_path = os.getcwd()
    try:
        os.mkdir(os.path.join(current_path, id))
    except FileExistsError:
        pass
    for l in link:
        file_name = str(l).split('/')[-1].split('?')[0]
        with open(id + '/' + file_name, "wb") as file:
            response = r.get(l)
            file.write(response.content)
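The file name is simply the last path segment of the image URL with the query string removed. As an alternative to the chained split calls, the same result can be sketched with the standard library's URL parser. file_name_from_url and target_path are hypothetical helpers, and the example URL is made up.

```python
import os
from urllib.parse import urlparse


def file_name_from_url(url):
    """Return the last path segment of a URL, without the query string."""
    return os.path.basename(urlparse(url).path)


def target_path(folder, url):
    """Build the local path for an image inside folder."""
    return os.path.join(folder, file_name_from_url(url))


# Made-up URL in the general shape of an image link.
example = "https://example.com/v/t51/12345_n.jpg?_nc_ht=abc&oh=def"
print(file_name_from_url(example))
```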
Full Code:
You can download the source code and replace the id with the id found in the API URL to start downloading the images.
    import os
    import requests as r

    # Numeric id of the account, taken from the "id" field of the API URL.
    id = '3762891297'

    # Create the output folder (named after the id) in the current directory.
    os.makedirs(id, exist_ok=True)

    base = 'https://www.instagram.com/graphql/query/?query_hash=003056d32c2554def87228bc3fd9668a&variables={"id":"' + id + '","first":12,"after":"'

    end = ''      # an empty cursor returns the first 12 posts
    check = True  # has_next_page of the latest response
    while check:
        nextLink = base + end + '"}'
        res = r.get(nextLink).json()
        media = res['data']['user']['edge_owner_to_timeline_media']

        # Collect the photo URLs of this page, skipping videos.
        link = []
        for e in media['edges']:
            if e['node']['is_video'] is False:
                link.append(e['node']['display_url'])
            # Posts with several images list them under edge_sidecar_to_children.
            if "edge_sidecar_to_children" in e['node']:
                for nee in e['node']['edge_sidecar_to_children']['edges']:
                    if nee['node']['is_video'] is False:
                        link.append(nee['node']['display_url'])
        print(len(link))

        # Download each image; the file name is the last URL path segment.
        for l in link:
            file_name = str(l).split('/')[-1].split('?')[0]
            with open(id + '/' + file_name, "wb") as file:
                file.write(r.get(l).content)

        # Move on to the next page, if there is one.
        end = media['page_info']['end_cursor']
        check = media['page_info']['has_next_page']
P.S. The code is written in the most rudimentary way; you are encouraged to revise it to make it cleaner.