If you run Python and see this warning related to Anaconda3:
Anaconda3\lib\site-packages\numpy\__init__.py:140: UserWarning: mkl-service package failed to import, therefore Intel(R) MKL initialization ensuring its correct out-of-the box operation under condition when Gnu OpenMP had already been loaded by Python process is not assured. Please install mkl-service package, see http://github.com/IntelPython/mkl-service
Solution:
Go to the environment variable settings and add this directory to PATH: C:\Users\username\Anaconda3\Library\bin
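Editing the environment variable makes the change permanent. If you only want to verify the idea, or fix it for the current process, you can prepend the directory to PATH from Python itself. This is a minimal sketch; the Anaconda path is the one from the fix above and must be adjusted for your machine:

```python
import os

def ensure_on_path(directory: str) -> None:
    """Prepend a directory to this process's PATH if it is not already there."""
    paths = os.environ.get("PATH", "").split(os.pathsep)
    if directory not in paths:
        os.environ["PATH"] = directory + os.pathsep + os.environ.get("PATH", "")

# The directory from the fix above; replace 'username' with your own account name.
ensure_on_path(r"C:\Users\username\Anaconda3\Library\bin")
```

Run this before `import numpy` so the MKL DLLs in Library\bin can be found.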
Monday, June 22, 2020
Sunday, June 7, 2020
python with database
import sqlite3

conn = sqlite3.connect('D:\\hskio_python.db')
try:
    info = []
    cur = conn.cursor()
    rows = cur.execute('select * from person')
    for row in rows:
        id = row[0]
        height = row[2]
        weight = row[3]
        bmi = round(weight/height**2, 2)
        print(id, height, weight, bmi)
        info.append([bmi, id])
    for data in info:
        # %s keeps the rounded BMI as a float; %d would truncate it to an integer
        cur.execute('update person set bmi=%s where id=%d' % (data[0], data[1]))
    conn.commit()
finally:
    conn.close()
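The `%` string formatting above works, but sqlite3 also supports `?` placeholders, which avoid formatting mistakes and quoting problems. This is a self-contained variant of the same update, using an in-memory database and an assumed schema of (id, name, height, weight, bmi); the real hskio_python.db columns may differ:

```python
import sqlite3

# In-memory database so the example runs anywhere; the schema here is an
# assumption modeled on the row[0]/row[2]/row[3] indexing used above.
conn = sqlite3.connect(':memory:')
try:
    cur = conn.cursor()
    cur.execute('create table person (id integer, name text, height real, weight real, bmi real)')
    cur.executemany('insert into person (id, name, height, weight) values (?, ?, ?, ?)',
                    [(1, 'amy', 1.6, 50), (2, 'bob', 1.8, 80)])
    info = []
    for row in cur.execute('select * from person'):
        id, height, weight = row[0], row[2], row[3]
        bmi = round(weight / height ** 2, 2)
        info.append((bmi, id))
    # ? placeholders let sqlite3 bind the values itself, floats included
    cur.executemany('update person set bmi=? where id=?', info)
    conn.commit()
    for row in cur.execute('select id, bmi from person order by id'):
        print(row)   # (1, 19.53) then (2, 24.69)
finally:
    conn.close()
```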
beautifulsoup example
Find and Findall Parameter:
findAll(tag, attributes, recursive, text, limit, keywords)
find(tag, attributes, recursive, text, keywords)
Find Parameter:
beautifulsoup example Find h1 tag
import requests
from bs4 import BeautifulSoup

resp = requests.get("https://code-gym.github.io/spider_demo/")
soup = BeautifulSoup(resp.text, 'html5lib')
print(soup.find('h1'))
beautifulsoup print with tag and without
With tag
print(soup.find('h1'))
Without tag
print(soup.h1)
Findall parameter:
for h3 in soup.find_all('h3'): print(h3)
Find class name
for title in soup.find_all('h3','post-title'): print(title)
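The remaining findAll parameters (attributes, limit, keywords) can be shown on a small inline document. This sketch uses Python's built-in 'html.parser' instead of html5lib so it needs only the bs4 package; the HTML and class names are made up for illustration:

```python
from bs4 import BeautifulSoup

html = """
<div>
  <h3 class="post-title">First</h3>
  <h3 class="post-title">Second</h3>
  <h3 class="other">Third</h3>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

# attributes: match by a dict of attribute values
print([t.text for t in soup.find_all('h3', {'class': 'post-title'})])

# limit: stop after the first N matches
print([t.text for t in soup.find_all('h3', limit=2)])

# keywords: class_ is the keyword form of the class attribute
print(soup.find('h3', class_='other').text)
```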
beautifulsoup crawl ptt stock
Crawl articles from the PTT Stock board
import requests
from bs4 import BeautifulSoup
import time

today = time.strftime('%m/%d').lstrip('0')

def ptt(url):
    resp = requests.get(url)
    if resp.status_code != 200:
        print('URL error: ' + url)
        return
    soup = BeautifulSoup(resp.text, 'html5lib')
    # href of the second paging button, i.e. the previous page of the board
    paging = soup.find('div', 'btn-group btn-group-paging').find_all('a')[1]['href']
    articles = []
    rents = soup.find_all('div', 'r-ent')
    for rent in rents:
        title = rent.find('div', 'title').text.strip()
        count = rent.find('div', 'nrec').text.strip()
        date = rent.find('div', 'meta').find('div', 'date').text.strip()
        article = '%s %s:%s' % (date, count, title)
        try:
            if today == date and int(count) > 10:
                articles.append(article)
        except ValueError:
            # the push count is not a number, e.g. '爆' for 100+ pushes
            if today == date and count == '爆':
                articles.append(article)
    if len(articles) != 0:
        for article in articles:
            print(article)
        ptt('https://www.ptt.cc' + paging)
    else:
        return

ptt('https://www.ptt.cc/bbs/Stock/index.html')
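The try/except above exists because PTT shows non-numeric push counts: '爆' for 100 or more pushes, and 'X1'..'XX' for downvoted articles. That logic can be pulled into a small helper; the names and the treatment of 'X…' as 0 are my own choices, and the threshold of 10 matches the crawler above:

```python
def parse_push_count(count: str) -> int:
    """Convert a PTT push-count string to an integer.

    '爆' marks 100 or more pushes; 'X1'..'XX' mark downvoted articles,
    treated as 0 here; an empty string means no pushes yet.
    """
    if count == '爆':
        return 100
    try:
        return int(count)
    except ValueError:
        return 0

def is_hot(count: str, threshold: int = 10) -> bool:
    """True when the article's push count exceeds the threshold."""
    return parse_push_count(count) > threshold

print(is_hot('爆'), is_hot('15'), is_hot('3'), is_hot(''))  # True True False False
```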
Thursday, June 4, 2020
python: using the correct pip
If you have installed both Python 2 and Python 3 on your PC, a bare pip command might be Python 2's pip. You can check with the command:
pip --version
D:\selenium-3.141.0.tar\dist\selenium-3.141.0>pip --version
pip 20.1.1 from c:\python27\lib\site-packages\pip (python 2.7)
You can also call a specific version's pip executable directly:
C:\python37\Scripts>pip3.exe install packagename
reference:
https://stackoverflow.com/questions/39851566/using-pip-on-windows-installed-with-both-python-2-7-and-3-5
https://stackoverflow.com/questions/40832533/pip-or-pip3-to-install-packages-for-python-3
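A less ambiguous option is to run pip as a module of the interpreter you actually want to install into, rather than relying on whichever pip comes first on PATH. A sketch, assuming `python3` (or the Windows `py` launcher) is available:

```shell
# Show which interpreter this pip is bound to
python3 -m pip --version

# Install into python3's site-packages, never python2's:
# python3 -m pip install selenium

# On Windows, the py launcher can target a version explicitly:
# py -2 -m pip install selenium
# py -3 -m pip install selenium
```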
selenium problem
This is an interesting topic and a funny thing about Selenium: after searching the web, I found an article that really solved the problem.
Problem: pip was used to install selenium and reported success, but the module only works on Python 2, not on Python 3. Sounds strange, doesn't it?
Solution: download the selenium package and install it manually.
How: extract the file, go to the extracted directory, and run the install command:
python setup.py install
Conclusion: selenium has to be installed manually.
Wednesday, June 3, 2020
Selenium
Chrome driver: chromedriver
https://chromedriver.chromium.org/downloads
Firefox driver: geckodriver
https://github.com/mozilla/geckodriver/releases
Basic Selenium
from selenium import webdriver
browser=webdriver.Chrome('D:\\chromedriver.exe')
browser.get('http://google.com')
browser.quit()
Selenium with beautifulsoup example 1: will pop chrome
from selenium import webdriver
from bs4 import BeautifulSoup

try:
    chrome = webdriver.Chrome(executable_path='D:\\CHROME_DRIVER\\chromedriver.exe')
    chrome.set_page_load_timeout(10)
    chrome.get('https://code-gym.github.io/spider_demo/')
    soup = BeautifulSoup(chrome.page_source, 'html5lib')
    print(soup.find('h1').text)
finally:
    chrome.quit()
Selenium with beautifulsoup example 2: will run chrome at daemon
from selenium import webdriver
from bs4 import BeautifulSoup

try:
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    chrome = webdriver.Chrome(options=options, executable_path='D:\\CHROME_DRIVER\\chromedriver.exe')
    chrome.set_page_load_timeout(10)
    chrome.get('https://code-gym.github.io/spider_demo/')
    soup = BeautifulSoup(chrome.page_source, 'html5lib')
    print(soup.find('h1').text)
finally:
    chrome.quit()
Selenium with beautifulsoup using xpath to find related article
from selenium import webdriver
from bs4 import BeautifulSoup

try:
    options = webdriver.ChromeOptions()
    # ... same setup as example 2 above ...
    print(soup.find('h1').text)
    chrome.find_element_by_xpath('/html/body/div[2]/div/div[1]/div[1]/div/div/h3/a').click()
    print(chrome.find_element_by_xpath('//*[@id="post-header"]/div[2]/div/div/h1').text)
finally:
    chrome.quit()