本项目使用python语言编写,采用Flaskweb框架来实现前后端交互,利于开发,维护,前端使用Html和jQuery处理事件,发送数据等,后端采用requests库,BeautifulSoup库实现爬取中国气象局的数据,清洗转化成对应表格数据格式,再使用pyecharts绘制图形,返回给前端页面实现实时展示,注意运行本项目需要联网!!!
输入你要查询的城市,点击搜索即可,由于网速,pyecharts的图形渲染等因素,图形展示需等待几秒才出现。
注意:水球图的渲染有时出不来,可多次点击搜索即可,我是这样的啦!!(与pyecharts图形渲染有关)
其中app.py为项目的启动文件及路由,templates包存放前端页面的,service包存放后端逻辑代码venv为排除目录(没啥用可不创建),包可以自己标记为对用的资源目录,我没有使用flask模板创建,而是自己标记的
创建好包就可以书写代码啦!!!
import sys import pandas as pd import requests from bs4 import BeautifulSoup # 请求头可写,但我看没报错,就没写 headers = { 'user-agent': '', 'Cookie': '' } # 列表划分,例如[1,2,3,4]=>[[1,2],[3,4]],目的适应pyecharts数据 def chunk_list(lst, size): return [lst[i:i + size] for i in range(0, len(lst), size)] # 数据图一的部分数据列表 def data1(soup): seven_dayList = [i.text.strip().replace(' ', '').replace('\n', '') for i in soup.select('.day-item')] temp = [] # 数据清洗 j = 0 for i in seven_dayList: if j % 10 == 0: temp.append('2023年' + i[3:5] + '月' + i[6:] + '日') temp.append(i[:3]) if j % 10 == 5: temp.append(i[:3]) temp.append(i[3:]) if i != '' and j % 10 != 0 and j % 10 != 5: temp.append(i) j += 1 # 数据格式对应表格 seven_dayLists = chunk_list(temp, 10) return seven_dayLists # 图二表格数据列表 def data2(soup): total_list = [i.text.strip() for i in soup.select('.hour-table td')] temp = [] for i in total_list: if i != '' and i != '天气': temp.append(i) # 时间 time_list = [] # 气温 temperature_list = [] # 降水 rainfall_list = [] # 风速 windspeed_list = [] # 风向 winddirection_list = [] # 气压 pressure_list = [] # 湿度 humidity_list = [] # 云量 cloud_list = [] categories = ['时间', '气温', '降水', '风速', '风向', '气压', '湿度', '云量'] current_category = None for item in temp: if item in categories: current_category = item else: if current_category == '时间': time_list.append(item) elif current_category == '气温': temperature_list.append(item) elif current_category == '降水': rainfall_list.append(item) elif current_category == '风速': windspeed_list.append(item) elif current_category == '风向': winddirection_list.append(item) elif current_category == '气压': pressure_list.append(item) elif current_category == '湿度': humidity_list.append(item) elif current_category == '云量': cloud_list.append(item) # 切割 return chunk_list(time_list, 8), chunk_list(temperature_list, 8), chunk_list(rainfall_list, 8), \ chunk_list(windspeed_list, 8), chunk_list(winddirection_list, 8), chunk_list(pressure_list, 8), \ chunk_list(humidity_list, 8), chunk_list(cloud_list, 8) # 数据转换(表格所需) def data_change(seven_dayList, temperature_list, windspeed_list, pressure_list, humidity_list, cloud_list): # 将seven_dayList切出最高气温和最低气温 data_list = [] high_temperature = [] low_temperature = [] for i in seven_dayList: data_list.append(i[0]) high_temperature.append(eval(i[5][:-1])) low_temperature.append(eval(i[6][:-1])) # 将temperature_list转化为数字 temperature_lists = [] for i in temperature_list: temperature_lists.append(eval(i[:-1])) # 将windspeed_list转化为数字 windspeed_lists = [] for i in windspeed_list: windspeed_lists.append(eval(i[:-3])) # 将pressure_list转化为数字 pressure_lists = [] for i in pressure_list: pressure_lists.append(eval(i[:-3])) # 将humidity_list转化为数字 humidity_lists = [] for i in humidity_list: humidity_lists.append(round(eval(i[:-1]) / 100, 3)) # 将cloud_list转化为数字 cloud_lists = [] for i in cloud_list: cloud_lists.append(eval(i[:-1])) return data_list, high_temperature, low_temperature, temperature_lists, \ windspeed_lists, pressure_lists, humidity_lists, cloud_lists # 关键代码(通过城市名字找到cityinfo.xls文件对应url后的对应代码) def weather_data(city): try: data = pd.read_excel('F:\PythonXM\WeatherXM\static\cityinfo.xls', index_col='城市名称') code = data.loc[city]['对应代码'] except Exception as e: print('输入的城市错误,请重新输入!') print(e) sys.exit() # 对应城市的url地址 url = 'https://weather.cma.cn/web/weather/{}.html'.format(code) res = requests.get(url=url, headers=headers).content.decode('utf-8') soup = BeautifulSoup(res, 'lxml') seven_dayList = data1(soup) time_list, temperature_list, rainfall_list, windspeed_list, winddirection_list, \ pressure_list, humidity_list, cloud_list = data2(soup) # 测试数据(上面是所有数据可自行选取,清洗) # print(seven_dayList) # print(time_list) # print(temperature_list) # print(rainfall_list) # print(windspeed_list) # print(winddirection_list) # print(pressure_list) # print(humidity_list) # print(cloud_list) # 只取当日数据和所需表格数据 data_list, high_temperature, low_temperature, temperature_lists, \ windspeed_lists, pressure_lists, humidity_lists, cloud_lists = \ data_change(seven_dayList, temperature_list[0], windspeed_list[0], pressure_list[0], humidity_list[0], cloud_list[0]) # 测试数据 # print(data_list) # print(time_list) # print(high_temperature) # print(low_temperature) # print(temperature_lists) # print(windspeed_lists) # print(pressure_lists) # print(humidity_lists) # print(cloud_lists) return seven_dayList, data_list, high_temperature, low_temperature, time_list[0], temperature_lists, \ windspeed_lists, pressure_lists, humidity_lists, cloud_lists if __name__ == '__main__': weather_data('张家界')
思路:其中weather_data函数为关键,其他的为数据清洗,这里感谢gitee上的一位博主(忘记叫啥了,哈哈)的cityinfo.xls文件,通过输入的城市,找到对应的尾部代号,进而请求该城市页面的数据,没这文件可以使用selenium去获取(这也是一种思路),不过会很慢!!文件在下面!!!!
cityinfo.xls文件:
遇到的问题点:pandas在读取包文件时,使用相对路径'../static/cityinfo.xls',在下面的测试没问题,但当项目运行时老是找不到该文件路径,故写的绝对路径才得以解决,至于原因,暂未知,有知道欢迎在下面讨论!
from pyecharts.globals import SymbolType from pyecharts.options import ComponentTitleOpts from WeatherXM.service.weatherData import weather_data import pyecharts.options as opts from pyecharts.charts import Line, Liquid, EffectScatter, Gauge, Timeline, Scatter, Page, Grid, Pie from pyecharts.components import Table # # 测试数据获取 # seven_dayList, data_list, high_temperature, low_temperature, time_list, temperature_lists, \ # windspeed_lists, pressure_lists, humidity_lists, cloud_lists = weather_data('上海') # 标题 def tab(name, color) -> Pie: # 作为标题 tab = ( Pie(init_opts=opts.InitOpts(width='100%',height='100px')). set_global_opts(title_opts=opts.TitleOpts(title=name, pos_left='center', pos_top='center', title_textstyle_opts=opts.TextStyleOpts(color=color, font_size=35)))) return tab # 需求一:当日温度变化曲线 def temperature_line(hour, temperature) -> Line: line = ( Line(init_opts=opts.InitOpts(width='50%')) .add_xaxis(hour) .add_yaxis("温度", temperature, is_connect_nones=True) .set_series_opts(label_opts=opts.LabelOpts(formatter='{@[1]}℃')) .set_global_opts(title_opts=opts.TitleOpts(title="当日平均温度变化曲线"), yaxis_opts=opts.AxisOpts( type_='value', axislabel_opts=opts.LabelOpts(formatter="{value} ℃"))) ) return line # 需求二:当日平均湿度 def humidity_liquid(humidity) -> Liquid(): liquid = ( Liquid(init_opts=opts.InitOpts(width='50%')) # .add("lq", [0.6, 0.7]) .add("湿度", humidity, is_outline_show=False) .set_global_opts(title_opts=opts.TitleOpts(title="当日平均湿度")) ) return liquid # 需求三:当日风速变化: def windspeed_effectScatter(hour, windspeed) -> EffectScatter: effectScatter = ( EffectScatter(init_opts=opts.InitOpts(width='50%')) .add_xaxis(hour) .add_yaxis("风速", windspeed, symbol=SymbolType.ARROW) .set_series_opts(label_opts=opts.LabelOpts(formatter='{@[1]}m/s')) .set_global_opts(title_opts=opts.TitleOpts(title="当日风速变化"), yaxis_opts=opts.AxisOpts( type_='value', axislabel_opts=opts.LabelOpts(formatter="{value}m/s"))) ) return effectScatter # 需求三:当日最高最低气温变化曲线 def high_low_temperature_line(hour, high_temperature, low_temperature) -> Line: line = ( Line(init_opts=opts.InitOpts(width='50%')) .add_xaxis(xaxis_data=hour) .add_yaxis( series_name="白天气温", y_axis=high_temperature, markpoint_opts=opts.MarkPointOpts( data=[ opts.MarkPointItem(type_="max", name="最大值"), opts.MarkPointItem(type_="min", name="最小值"), ] ), markline_opts=opts.MarkLineOpts( data=[opts.MarkLineItem(type_="average", name="平均值")] ), ) .add_yaxis( series_name="夜晚气温", y_axis=low_temperature, markpoint_opts=opts.MarkPointOpts( data=[ opts.MarkPointItem(type_="max", name="最大值"), opts.MarkPointItem(type_="min", name="最小值"), ] ), markline_opts=opts.MarkLineOpts( data=[opts.MarkLineItem(type_="average", name="平均值")] ), ) .set_series_opts(label_opts=opts.LabelOpts(formatter='{@[1]}℃')) .set_global_opts( title_opts=opts.TitleOpts(title="七日最高最低气温变化曲线"), tooltip_opts=opts.TooltipOpts(trigger="axis"), toolbox_opts=opts.ToolboxOpts(is_show=True), xaxis_opts=opts.AxisOpts(type_="category", boundary_gap=False), yaxis_opts=opts.AxisOpts( type_='value', axislabel_opts=opts.LabelOpts(formatter="{value} ℃")) ) ) return line # 需求四:当日气压 def pressure_gauge(hour, pressure) -> Timeline: tl = Timeline(init_opts=opts.InitOpts(width='50%')) for i in range(len(hour)): gauge = ( Gauge() .set_global_opts(title_opts=opts.TitleOpts(title="气压仪表盘"), legend_opts=opts.LegendOpts(is_show=False)) .add(min_=0, max_=1500, data_pair=[(f'{hour[i]}时气压', pressure[i])], series_name=f'气压', detail_label_opts=opts.GaugeDetailOpts(formatter="{value}hPa")) ) tl.add(gauge, '{}'.format(hour[i])) return tl # 需求五:当日云量散点图 def cloud_scatter(hour, cloud) -> Scatter: scatter = ( Scatter(init_opts=opts.InitOpts(width='50%')) .add_xaxis(hour) .add_yaxis("云量", cloud) .set_series_opts(label_opts=opts.LabelOpts(formatter='{@[1]}%')) .set_global_opts( title_opts=opts.TitleOpts(title="当日云量散点图"), visualmap_opts=opts.VisualMapOpts(type_="size", max_=150, min_=20), yaxis_opts=opts.AxisOpts( type_='value', axislabel_opts=opts.LabelOpts(formatter="{value}%")) ) ) return scatter # 需求六:七日天气预报表格 def seven_day_table(seven_day) -> Table: headers = ['日期', '星期', '白天天气', '白天风向', '白天风力', '白天气温', '晚上气温', '晚上天气', '晚上风向', '晚上风力'] table = ( Table() .add(headers, seven_day) .set_global_opts( title_opts=ComponentTitleOpts(title="七日天气预报") ) ) return table # 需求汇总 def get_chart(city): # 数据获取 seven_dayList, data_list, high_temperature, low_temperature, time_list, temperature_lists, \ windspeed_lists, pressure_lists, humidity_lists, cloud_lists = weather_data(city) page = Page(layout=Page.SimplePageLayout) page.add( tab(f'{city}未来七日天气预报', '#000000'), temperature_line(time_list, temperature_lists), high_low_temperature_line(data_list, high_temperature, low_temperature), humidity_liquid(humidity_lists), pressure_gauge(time_list, pressure_lists), windspeed_effectScatter(time_list, windspeed_lists), cloud_scatter(time_list, cloud_lists), seven_day_table(seven_dayList) ) # page.render("page_draggable_layout.html") return page # def get_chart(city): # # 数据获取 # seven_dayList, data_list, high_temperature, low_temperature, time_list, temperature_lists, \ # windspeed_lists, pressure_lists, humidity_lists, cloud_lists = weather_data(city) # return temperature_line(time_list, temperature_lists), \ # humidity_liquid(humidity_lists), \ # windspeed_effectScatter(time_list, windspeed_lists), \ # high_low_temperature_line(data_list, high_temperature, low_temperature), \ # pressure_gauge(time_list, pressure_lists), \ # cloud_scatter(time_list, cloud_lists), \ # seven_day_table(seven_dayList) if __name__ == '__main__': get_chart('衡阳市')
思路:图表的绘制就没什么可说的了,可自行查阅官方文档:简介 - pyecharts - A Python Echarts Plotting Library built with love.
这里务必使用pyecharts的组合图表将多个图组合在一起返回,不要像get_charts函数一样返回多个图表类!!!
组合图表建议使用这种,当时也使用了Grid,但绘制的图表,位置有问题不规范好看,自己不熟练的问题(教训:多看看官方文档,害我想了好久!!)
from flask import Flask, render_template, request, send_from_directory, render_template_string from pyecharts.render import make_snapshot, snapshot from WeatherXM.service.dataShow import get_chart app = Flask(__name__, static_folder="static", template_folder='templates') @app.route("/") def index(): return render_template("index.html") @app.route("/getdata", methods=['GET']) def get_weather_data(): city = request.args.get('city') page = get_chart(city) # page.render('page.html') # print(a.dump_options_with_quotes()) # return send_from_directory('../', 'page.html') # 成功 # return render_template('page1.html', chart=page.render_embed()) # 失败 chart = page.render_embed() return render_template_string(chart) if __name__ == "__main__": app.run()
Awesome-pyecharts 天气查询:
思路:通过点击搜索,发送数据给后端,后端返回html字符串,前端使用使用html解析
遇到的问题点:这里我使用的是使用 render_template_string函数来渲染简单的模板或动态生成的 HTML 内容,并返回给前端,也可以使用send_from_directory函数返回绘制好的Page.html文件
这里为什么不采用官网的a.dump_options_with_quotes()(本质是将a这个图类转化为HTML)呢?
答:因为不能返回多个HTML,不然前端无法解析,也不能各自转化为json数据,前端接收不到
这里之前我的思路是将多个图类返回,使用a.dump_options_with_quotes()直接返回列表给前端,或使用jsons数据返回,但前端就是接收不到,弄得我好烦,之后看了flask的文档看到了这两个方法!!有兴趣的可以试试其他的!!
思路:这里一定要配置static_folder,template_folder,不然加载不出来的,不要问我为什么,呜呜呜!!
cityinfo.xls文件地址:cityinfo.xls文件
至此,项目就写完了,可以准备启动了
点击右上角的编辑配置,配置flask服务器:
配置好点击运行,点击地址就行了
gitee地址:
python+pyecharts+flask+爬虫实现实时天气查询可视化: 本项目使用python语言编写,采用Flaskweb框架来实现前后端交互,利于开发,维护,前端使用Html和jQuery处理事件,发送数据等,后端采用requests库,BeautifulSoup库实现爬取中国气象局的数据,清洗转化成对应表格数据格式,再使用pyecharts绘制图形,返回给前端页面实现实时展示,注意运行本项目需要联网!!! (gitee.com)https://gitee.com/TheQuietCoder/WeatherXM
整个项目说起来也简单,但其实问题也挺多的,不断遇到bug,不断解决bug,有时候为了解决一个bug,网页都翻烂了都找不到解决方法,这个过程是需要耐心的,当你真正解决了它,回过头看也不过如此,我相信这些bug终将成为你成功的垫脚石,回首望去,轻舟已过万重山!!
项目书写,码字不易,希望各位大佬们能留个赞👍吧!!!