Introduction:
In this post I will introduce Selenium, a browser automation tool I often use to automate tasks in the browser the way a real user would perform them. I will combine Selenium with Python to build a Facebook bot that automatically fetches the latest news from https://www.24h.com.vn/ and sends it to my main Facebook account.
Installing Selenium with Python
There are already plenty of articles on setting up the environment, so please refer to one of them.
Step 1: Log in to Facebook with a cookie:
Here I create a clone Facebook account in advance and take its cookie to log in. I do not log in with the username and password directly, because logging in from an unfamiliar browser tends to trigger a Facebook checkpoint, and using a cookie reduces that. Log in to Facebook and grab the cookie as shown in the image. If you prefer, you can log in with the username and password instead, though my guess is that you will be blocked on the very first login.
First, get the clone's cookie as shown in the image:
Write a script that logs in to Facebook with the cookie you just obtained:
```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
import time


def initDriver():
    CHROMEDRIVER_PATH = '/usr/bin/chromedriver'
    WINDOW_SIZE = "1920,1080"
    chrome_options = Options()
    # chrome_options.add_argument("--headless")
    chrome_options.add_argument("--window-size=%s" % WINDOW_SIZE)
    chrome_options.add_argument('--no-sandbox')
    # Selenium 4 receives the chromedriver path through a Service object
    driver = webdriver.Chrome(service=Service(CHROMEDRIVER_PATH), options=chrome_options)
    return driver


driver = initDriver()
user_receive = "100009116007864"
cookie = "Cookie: fr=0ib4TxzI1pcwOr8aG.AWVFTLqO6sKdJ0jOOB5tp81vVgM.Bffts3.2O.AAA.0.0.BfgyU0.AWWUe73A0nA; sb=VgoPX7xwsLcGBg9tY1y_rPgP; datr=VgfgoPX0PyzfdsfoHMD38tzdjrr0uC; _fbp=fb.1.159533sdsfds8346205.1354317687; wd=1832x346; c_user=100039058042146; xs=44%3AQS-_M-LyD-t_ww%3A2%3A160fdsfs2427402%3A19608%3A14949%3A%3AAcW_NFvRVJOj5-W14FBfJ2PZD9G__13Iy8QQ4nWD4Q; spin=r.1002804926_b.trunk_t.1602427404_s.1_v.2_"


def loginFacebookByCookie(cookie):
    # Strip the leading "Cookie: " header name, otherwise the first cookie is misnamed
    if cookie.startswith("Cookie: "):
        cookie = cookie[len("Cookie: "):]
    # The browser must already be on facebook.com for the domain attribute to be accepted
    script = '''
        var list = arguments[0].split("; ");
        for (var i = list.length - 1; i >= 0; i--) {
            var pos = list[i].indexOf("=");
            var cname = list[i].substring(0, pos);
            var cvalue = list[i].substring(pos + 1);  // keep any "=" inside the value
            var d = new Date();
            d.setTime(d.getTime() + (7 * 24 * 60 * 60 * 1000));  // expires in 7 days
            var expires = ";domain=.facebook.com;expires=" + d.toUTCString();
            document.cookie = cname + "=" + cvalue + expires;
        }
        location.href = "https://facebook.com";
    '''
    # Pass the cookie as a script argument instead of concatenating it into the JavaScript
    driver.execute_script(script, cookie)
```
The cookie above is mine, so I have already altered it :))
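As an alternative to injecting JavaScript, Selenium's own `driver.add_cookie` can set the cookies one at a time. The sketch below is only an illustration under assumptions: `parse_cookie_header` and `apply_cookies` are hypothetical helper names that are not part of the original script, the sample cookie string is made up, and the browser is assumed to be on a facebook.com page already, since `add_cookie` only accepts cookies for the current domain.

```python
def parse_cookie_header(raw):
    """Turn 'Cookie: a=1; b=2' into a list of dicts that driver.add_cookie accepts."""
    if raw.startswith("Cookie:"):
        raw = raw[len("Cookie:"):]
    cookies = []
    for pair in raw.split(";"):
        pair = pair.strip()
        if "=" not in pair:
            continue  # skip empty or malformed fragments
        name, value = pair.split("=", 1)  # keep any "=" inside the value
        cookies.append({"name": name, "value": value, "domain": ".facebook.com"})
    return cookies


def apply_cookies(driver, raw):
    """Hypothetical helper: add every parsed cookie, then reload the page."""
    for item in parse_cookie_header(raw):
        driver.add_cookie(item)
    driver.refresh()


if __name__ == "__main__":
    sample = "Cookie: c_user=123; xs=abc%3Adef=; wd=1832x346"  # made-up values
    for item in parse_cookie_header(sample):
        print(item["name"], "->", item["value"])
```

Splitting on the first `=` only matters for values that themselves contain `=`, which would otherwise be cut short.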
Login demo:
Step 2: Get the latest news from 24h.com.vn:
First, identify the elements of the latest articles so that you can extract their titles:
Right-click the element and choose Inspect; I select it by XPath:
Script:
```python
from selenium.webdriver.common.by import By


def getNews(driver):
    news_arr = []
    driver.get('https://www.24h.com.vn/tin-tuc-trong-ngay-c46.html')
    # Each match is the <a> inside an article headline
    news = driver.find_elements(By.XPATH, '//*[@id="cated"]/div/section/div/div/div/div/article/div/div/header/h2/a')
    for item in news:
        # ": Link nè :" means "here is the link:"
        text = str(item.text) + ": Link nè :" + str(item.get_attribute('href'))
        news_arr.append(text)
    return news_arr
```
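To see what an XPath like the one above picks out without opening a browser, here is a small sketch using Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath. The markup in `SAMPLE` is invented for illustration and is much simpler than the real 24h.com.vn page, and `extract_news` is a hypothetical name. Only the idea carries over: walk from the container with a known `id` down to the `h2/a` links, then read the text and the `href`.

```python
import xml.etree.ElementTree as ET

# Invented, simplified markup; the real page nests the links much deeper.
SAMPLE = """
<html><body>
  <div id="cated">
    <article><header><h2><a href="https://example.com/a">First headline</a></h2></header></article>
    <article><header><h2><a href="https://example.com/b">Second headline</a></h2></header></article>
  </div>
  <div id="sidebar">
    <article><header><h2><a href="https://example.com/ad">Not news</a></h2></header></article>
  </div>
</body></html>
"""


def extract_news(markup):
    """Return 'title: Link nè :url' strings for links under the id="cated" container."""
    root = ET.fromstring(markup)
    # Start at any element whose id is "cated", then follow the child path to the links
    links = root.findall(".//*[@id='cated']/article/header/h2/a")
    return [f"{a.text}: Link nè :{a.get('href')}" for a in links]


if __name__ == "__main__":
    for line in extract_news(SAMPLE):
        print(line)
```

The link in the sidebar is skipped because the path is anchored on the `cated` container, which is the same reason the article's XPath starts from `//*[@id="cated"]`.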
Step 3: Write a script that automatically sends messages to Facebook at a set interval:
In the same way, I locate the textarea and the send button in order to write the script that sends the message automatically; inspect those elements yourself:
OK, let's write the script:
```python
from selenium.webdriver.common.by import By


def sendMessage(message):
    # Open the mbasic compose page for the recipient
    driver.get('https://mbasic.facebook.com/messages/compose/?ids[0]=' + user_receive)
    text_input = driver.find_elements(By.TAG_NAME, 'textarea')
    if len(text_input) > 0:
        text_input[0].send_keys(message)
        # Click the send button next to the textarea
        driver.find_element(By.XPATH, '//*[@id="composer_form"]/div[2]/table/tbody/tr/td[2]/input').click()
```
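One refinement worth considering: the recipient id is concatenated straight into the URL, and the function gives no signal when the textarea is missing (for example when the session has expired). The sketch below is a variant under assumptions, not the original method: `build_compose_url` and `send_message` are hypothetical names, the driver is passed in explicitly, the server is assumed to decode the percent-encoded brackets as usual, and the locator strings `"tag name"` and `"xpath"` are the values behind Selenium's `By.TAG_NAME` and `By.XPATH` constants.

```python
from urllib.parse import urlencode

COMPOSE_URL = "https://mbasic.facebook.com/messages/compose/"
SEND_BUTTON_XPATH = '//*[@id="composer_form"]/div[2]/table/tbody/tr/td[2]/input'


def build_compose_url(user_id):
    """Percent-encode the query so the brackets in ids[0] are URL-safe."""
    return COMPOSE_URL + "?" + urlencode({"ids[0]": user_id})


def send_message(driver, user_id, message):
    """Hypothetical variant of sendMessage: returns True if the form was found."""
    driver.get(build_compose_url(user_id))
    # "tag name" and "xpath" are the string values of By.TAG_NAME and By.XPATH
    boxes = driver.find_elements("tag name", "textarea")
    if not boxes:
        return False  # no compose form, e.g. the login did not succeed
    boxes[0].send_keys(message)
    driver.find_element("xpath", SEND_BUTTON_XPATH).click()
    return True


if __name__ == "__main__":
    print(build_compose_url("100009116007864"))
```

Returning a boolean lets the calling loop decide whether to retry or stop when a message could not be sent.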
The result:
The complete script:
```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time


def initDriver():
    CHROMEDRIVER_PATH = '/usr/bin/chromedriver'
    WINDOW_SIZE = "1920,1080"
    chrome_options = Options()
    # chrome_options.add_argument("--headless")
    chrome_options.add_argument("--window-size=%s" % WINDOW_SIZE)
    chrome_options.add_argument('--no-sandbox')
    # Selenium 4 receives the chromedriver path through a Service object
    driver = webdriver.Chrome(service=Service(CHROMEDRIVER_PATH), options=chrome_options)
    return driver


driver = initDriver()
user_receive = "100009116007864"  # Facebook uid of the recipient
cookie = "Cookie: fr=0ib4TxzI1pcwOr8aG.AWVFTLqO6sKdJ0jOOB5tp81vVgM.Bffts3.2O.AAA.0.0.BfgyU0.AWWUe73A0nA; sb=VgoPX7xwsLbcGBg9cxtY1y_rPgP; datr=VgoPX0PyzoHMD38tzdjrr0uC; _fbp=fb.1.15953383fsdfdsfdf46205.1354317687; wd=1832x346; c_user=100039058042146; xs=44%3AQS-_M-LyD-t_ww%3A2%3A1602427402%3A19608%3A14949%3A%3AAcW_NFvRVJOj5-W14FBfJ2PZD9G__13Iy8QQ4nWD4Q; spin=r.1002804926_b.trunk_t.1602fds427404_s.1_v.2_"  # the clone's cookie; use a fresh cookie of your own
time_delay = 60  # delay in seconds before each round of news


def getNews(driver):
    news_arr = []
    driver.get('https://www.24h.com.vn/tin-tuc-trong-ngay-c46.html')
    news = driver.find_elements(By.XPATH, '//*[@id="cated"]/div/section/div/div/div/div/article/div/div/header/h2/a')
    for item in news:
        # ": Link nè :" means "here is the link:"
        text = str(item.text) + ": Link nè :" + str(item.get_attribute('href'))
        news_arr.append(text)
    return news_arr


def sendMessage(message):
    driver.get('https://mbasic.facebook.com/messages/compose/?ids[0]=' + user_receive)
    text_input = driver.find_elements(By.TAG_NAME, 'textarea')
    if len(text_input) > 0:
        text_input[0].send_keys(message)
        driver.find_element(By.XPATH, '//*[@id="composer_form"]/div[2]/table/tbody/tr/td[2]/input').click()


def loginFacebookByCookie(cookie):
    # Strip the leading "Cookie: " header name, otherwise the first cookie is misnamed
    if cookie.startswith("Cookie: "):
        cookie = cookie[len("Cookie: "):]
    script = '''
        var list = arguments[0].split("; ");
        for (var i = list.length - 1; i >= 0; i--) {
            var pos = list[i].indexOf("=");
            var cname = list[i].substring(0, pos);
            var cvalue = list[i].substring(pos + 1);  // keep any "=" inside the value
            var d = new Date();
            d.setTime(d.getTime() + (7 * 24 * 60 * 60 * 1000));  // expires in 7 days
            var expires = ";domain=.facebook.com;expires=" + d.toUTCString();
            document.cookie = cname + "=" + cvalue + expires;
        }
        location.href = "https://facebook.com";
    '''
    driver.execute_script(script, cookie)


# Open facebook.com first so the cookies can be set for that domain
driver.get("https://facebook.com/")
loginFacebookByCookie(cookie)
try:
    while True:
        try:
            time.sleep(time_delay)
            # "Tin tức mới trong ngày nè Xếp" means "Here is today's news, boss"
            sendMessage('Tin tức mới trong ngày nè Xếp : ')
            news = getNews(driver)
            for item in news:
                time.sleep(30)
                sendMessage(str(item))
        except Exception as exc:
            print('err', exc)
finally:
    # Runs on Ctrl+C as well, so the browser does not stay open
    driver.quit()
```
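Note that the loop above sends every headline on the page again on each pass. If you only want headlines that have not been sent yet, one possible approach is to remember what was already sent. This is a sketch under assumptions: `filter_unseen` is a hypothetical helper that does not exist in the original script, and it keeps its memory in a plain in-process set, so it forgets everything when the script restarts.

```python
def filter_unseen(news, seen):
    """Return the items of `news` not yet in `seen`, in order, and record them in `seen`."""
    fresh = []
    for item in news:
        if item not in seen:
            seen.add(item)
            fresh.append(item)
    return fresh


if __name__ == "__main__":
    seen = set()
    first_pass = ["A: Link nè :https://example.com/a", "B: Link nè :https://example.com/b"]
    second_pass = ["B: Link nè :https://example.com/b", "C: Link nè :https://example.com/c"]
    print(filter_unseen(first_pass, seen))   # both items are new
    print(filter_unseen(second_pass, seen))  # only the "C" item is new
```

In the main loop this would mean creating `seen = set()` once before `while True:` and iterating over `filter_unseen(getNews(driver), seen)` instead of the full list.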
With the script finished, I can deploy it to an Ubuntu VPS, and I now have a bot that keeps me up to date with fresh news every day. (yaoming!!)
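On a VPS there is usually no display, so Chrome generally has to run headless there; the `--headless` line is commented out in `initDriver`. A minimal sketch of switching it on automatically follows. It assumes a Linux server where the `DISPLAY` environment variable is unset when no display is available; `chrome_arguments` is a hypothetical helper, and `--disable-dev-shm-usage` is an extra Chrome flag that is not in the original script, often used on small servers with limited shared memory.

```python
import os


def chrome_arguments(environ=None, window_size="1920,1080"):
    """Build the Chrome flag list, adding --headless when no display is set."""
    environ = os.environ if environ is None else environ
    args = ["--window-size=%s" % window_size, "--no-sandbox"]
    if not environ.get("DISPLAY"):
        # Assumption: no DISPLAY means a headless server such as a VPS
        args.append("--headless")
        args.append("--disable-dev-shm-usage")
    return args


if __name__ == "__main__":
    print(chrome_arguments())
```

Inside `initDriver`, the flags could then be applied with `for arg in chrome_arguments(): chrome_options.add_argument(arg)`.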
Thanks for reading!