Real-time weather lookup and visualization with Python, pyecharts, Flask, and a web scraper
Author: mmseoamin | Date: 2023-12-21

I. Project overview

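The project is written in Python and uses the Flask web framework for the interaction between front end and back end, which keeps development and maintenance simple. The front end uses HTML and jQuery to handle events and send requests. The back end uses the requests and BeautifulSoup libraries to scrape data from the China Meteorological Administration (weather.cma.cn), cleans it into table-shaped lists, draws the charts with pyecharts, and returns them to the page for live display. Note that the project needs an internet connection to run.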

II. Project demo

Enter the city you want to look up and click Search. Because of network latency and pyecharts rendering time, the charts take a few seconds to appear.


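Note: the liquid (water-ball) chart sometimes fails to render; clicking Search again usually brings it up. That is what I saw, at least, and it seems to be related to how pyecharts renders the chart.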


III. Implementation

1. Package layout

[Screenshot: project package structure]

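app.py is the project's entry point and holds the routes, the templates package holds the front-end pages, and the service package holds the back-end logic. venv is an excluded directory (it is not needed, so you can skip creating it). You can mark the directories as the corresponding resource roots yourself; I did not create the project from the Flask template but marked them by hand.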

Once the packages exist, you can start writing code.

2. Writing the service package

2.1. weatherData (scraping the weather data)
import sys
import pandas as pd
import requests
from bs4 import BeautifulSoup
# Request headers are optional; the requests succeeded without them, so the values are left empty
headers = {
    'user-agent': '',
    'Cookie': ''
}
# Split a list into chunks, e.g. [1, 2, 3, 4] => [[1, 2], [3, 4]], to match the shape pyecharts expects
def chunk_list(lst, size):
    return [lst[i:i + size] for i in range(0, len(lst), size)]
# Rows for the seven-day forecast (first data set)
def data1(soup):
    seven_dayList = [i.text.strip().replace(' ', '').replace('\n', '') for i in soup.select('.day-item')]
    temp = []
    # Data cleaning
    j = 0
    for i in seven_dayList:
        if j % 10 == 0:
            temp.append('2023年' + i[3:5] + '月' + i[6:] + '日')
            temp.append(i[:3])
        if j % 10 == 5:
            temp.append(i[:3])
            temp.append(i[3:])
        if i != '' and j % 10 != 0 and j % 10 != 5:
            temp.append(i)
        j += 1
    # Shape the data as table rows (10 columns per day)
    seven_dayLists = chunk_list(temp, 10)
    return seven_dayLists
# Lists for the hourly table (second data set)
def data2(soup):
    total_list = [i.text.strip() for i in soup.select('.hour-table td')]
    temp = []
    for i in total_list:
        if i != '' and i != '天气':
            temp.append(i)
    # Time
    time_list = []
    # Temperature
    temperature_list = []
    # Precipitation
    rainfall_list = []
    # Wind speed
    windspeed_list = []
    # Wind direction
    winddirection_list = []
    # Air pressure
    pressure_list = []
    # Humidity
    humidity_list = []
    # Cloud cover
    cloud_list = []
    categories = ['时间', '气温', '降水', '风速', '风向', '气压', '湿度', '云量']
    current_category = None
    for item in temp:
        if item in categories:
            current_category = item
        else:
            if current_category == '时间':
                time_list.append(item)
            elif current_category == '气温':
                temperature_list.append(item)
            elif current_category == '降水':
                rainfall_list.append(item)
            elif current_category == '风速':
                windspeed_list.append(item)
            elif current_category == '风向':
                winddirection_list.append(item)
            elif current_category == '气压':
                pressure_list.append(item)
            elif current_category == '湿度':
                humidity_list.append(item)
            elif current_category == '云量':
                cloud_list.append(item)
    # Split each list into chunks of 8
    return chunk_list(time_list, 8), chunk_list(temperature_list, 8), chunk_list(rainfall_list, 8), \
           chunk_list(windspeed_list, 8), chunk_list(winddirection_list, 8), chunk_list(pressure_list, 8), \
           chunk_list(humidity_list, 8), chunk_list(cloud_list, 8)
# Data conversion (numeric lists the charts need)
def data_change(seven_dayList, temperature_list, windspeed_list, pressure_list, humidity_list, cloud_list):
    # Values are parsed with float() rather than eval(): scraped text must not be executed as code
    # Pull the dates and the daytime/night temperatures out of seven_dayList
    data_list = []
    high_temperature = []
    low_temperature = []
    for i in seven_dayList:
        data_list.append(i[0])
        high_temperature.append(float(i[5][:-1]))
        low_temperature.append(float(i[6][:-1]))
    # Convert temperature_list to numbers (drop the 1-character unit)
    temperature_lists = []
    for i in temperature_list:
        temperature_lists.append(float(i[:-1]))
    # Convert windspeed_list to numbers (drop the 3-character unit)
    windspeed_lists = []
    for i in windspeed_list:
        windspeed_lists.append(float(i[:-3]))
    # Convert pressure_list to numbers (drop the 3-character unit)
    pressure_lists = []
    for i in pressure_list:
        pressure_lists.append(float(i[:-3]))
    # Convert humidity_list from percentages to fractions between 0 and 1
    humidity_lists = []
    for i in humidity_list:
        humidity_lists.append(round(float(i[:-1]) / 100, 3))
    # Convert cloud_list to numbers (drop the 1-character unit)
    cloud_lists = []
    for i in cloud_list:
        cloud_lists.append(float(i[:-1]))
    return data_list, high_temperature, low_temperature, temperature_lists, \
           windspeed_lists, pressure_lists, humidity_lists, cloud_lists
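# Key step: look up the city's code in cityinfo.xls and use it to build the page URL
def weather_data(city):
    try:
        # Raw string so the backslashes in the Windows path are not treated as escapes
        data = pd.read_excel(r'F:\PythonXM\WeatherXM\static\cityinfo.xls', index_col='城市名称')
        code = data.loc[city]['对应代码']
    except Exception as e:
        # Message text: "Invalid city name, please enter it again!"
        print('输入的城市错误,请重新输入!')
        print(e)
        sys.exit()
    # URL of the page for this city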
    url = 'https://weather.cma.cn/web/weather/{}.html'.format(code)
    res = requests.get(url=url, headers=headers).content.decode('utf-8')
    soup = BeautifulSoup(res, 'lxml')
    seven_dayList = data1(soup)
    time_list, temperature_list, rainfall_list, windspeed_list, winddirection_list, \
    pressure_list, humidity_list, cloud_list = data2(soup)
    # Debug output (everything scraped above is available here; pick and clean what you need)
    # print(seven_dayList)
    # print(time_list)
    # print(temperature_list)
    # print(rainfall_list)
    # print(windspeed_list)
    # print(winddirection_list)
    # print(pressure_list)
    # print(humidity_list)
    # print(cloud_list)
    # Keep only today's data and the values the charts need
    data_list, high_temperature, low_temperature, temperature_lists, \
    windspeed_lists, pressure_lists, humidity_lists, cloud_lists = \
        data_change(seven_dayList, temperature_list[0], windspeed_list[0],
                    pressure_list[0], humidity_list[0], cloud_list[0])
    # Debug output
    # print(data_list)
    # print(time_list)
    # print(high_temperature)
    # print(low_temperature)
    # print(temperature_lists)
    # print(windspeed_lists)
    # print(pressure_lists)
    # print(humidity_lists)
    # print(cloud_lists)
    return seven_dayList, data_list, high_temperature, low_temperature, time_list[0], temperature_lists, \
           windspeed_lists, pressure_lists, humidity_lists, cloud_lists
if __name__ == '__main__':
    weather_data('张家界')
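
The conversion helpers above turn strings that end in a unit into numbers by slicing off a fixed number of characters. A minimal sketch of the same cleaning step that checks the unit suffix explicitly before parsing, so unexpected page content raises a clear error. The strip_unit name and the sample strings are illustrative assumptions, not part of the project; the units are the ones implied by the slicing and the chart labels.

```python
def chunk_list(lst, size):
    """Split lst into consecutive pieces of length size (the last may be shorter)."""
    return [lst[i:i + size] for i in range(0, len(lst), size)]


def strip_unit(text, unit):
    """Convert a string such as '12.3℃' to 12.3 by removing a known unit suffix.

    float() only accepts numeric text, so anything unexpected raises
    ValueError instead of being evaluated as code.
    """
    text = text.strip()
    if not text.endswith(unit):
        raise ValueError(f"expected a value ending in {unit!r}, got {text!r}")
    return float(text[:-len(unit)])


# Hypothetical sample values in the shape the slicing code implies.
temperatures = [strip_unit(t, "℃") for t in ["12.3℃", "-1℃"]]
pressures = [strip_unit(p, "hPa") for p in ["1013hPa"]]
# Humidity is scaled to a 0..1 fraction, as the liquid chart expects.
humidity = [round(strip_unit(h, "%") / 100, 3) for h in ["85.2%"]]
hours = chunk_list(["08:00", "11:00", "14:00", "17:00"], 2)
```

Checking the suffix costs one extra line per value but makes a layout change on the scraped page show up as an error rather than as silently wrong numbers.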

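Approach: weather_data is the key function; the others only clean the data. Thanks to a blogger on gitee (I forget the name) for the cityinfo.xls file. Given the city the user typed, the code looks up the matching code that ends the page URL and then requests that city's page. Without the file you could obtain the code with selenium instead (another workable approach), but it would be much slower. The file is linked further down.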
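
The lookup described above can be sketched without pandas or the spreadsheet. This is a minimal illustration, assuming the city-to-code mapping has already been loaded into a dict; build_weather_url, sample_codes, and the codes themselves are placeholders, and the real values come from cityinfo.xls.

```python
CMA_URL = "https://weather.cma.cn/web/weather/{}.html"


def build_weather_url(city, city_codes):
    """Return the forecast page URL for city, or raise KeyError if it is unknown."""
    try:
        code = city_codes[city]
    except KeyError:
        raise KeyError(f"unknown city: {city}") from None
    return CMA_URL.format(code)


# Placeholder names and codes for illustration only.
sample_codes = {"城市A": "00001", "城市B": "00002"}
url = build_weather_url("城市A", sample_codes)
```

Raising an exception instead of calling sys.exit() leaves the decision to the caller, so a Flask route could catch the error and return a message for an unknown city rather than letting SystemExit propagate out of the request handler.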


[Screenshot: contents of the cityinfo.xls file]

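Problem encountered: when pandas read the file through the relative path '../static/cityinfo.xls', the test at the bottom of the module worked, but when the whole project ran the file could not be found, so I switched to an absolute path. The likely cause is that a relative path is resolved against the process's current working directory rather than the directory of the module that contains it, and the working directory is different when the project is started from app.py. Comments are welcome if you know more.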
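
One way to avoid both the relative-path problem and the hard-coded absolute path is to build the path from the module's own location. A minimal sketch, assuming the layout described earlier (service/ and static/ side by side under the project root); static_file is an illustrative helper, not part of the project.

```python
from pathlib import Path


def static_file(module_file, name):
    """Return <project>/static/<name>, given the path of a module in <project>/service/."""
    project_root = Path(module_file).resolve().parent.parent
    return project_root / "static" / name


# Inside service/weatherData.py this would be called as:
#     path = static_file(__file__, "cityinfo.xls")
#     data = pd.read_excel(path, index_col='城市名称')
```

Because the result depends only on where the module file lives, it is the same whether the module is run directly or imported by app.py from another working directory.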

2.2. dataShow (charts for visualization)
from pyecharts.globals import SymbolType
from pyecharts.options import ComponentTitleOpts
from WeatherXM.service.weatherData import weather_data
import pyecharts.options as opts
from pyecharts.charts import Line, Liquid, EffectScatter, Gauge, Timeline, Scatter, Page, Grid, Pie
from pyecharts.components import Table
# # Fetch test data
# seven_dayList, data_list, high_temperature, low_temperature, time_list, temperature_lists, \
# windspeed_lists, pressure_lists, humidity_lists, cloud_lists = weather_data('上海')
# Page title
def tab(name, color) -> Pie:  # an empty Pie chart used only to display a title
    tab = (
        Pie(init_opts=opts.InitOpts(width='100%', height='100px'))
        .set_global_opts(title_opts=opts.TitleOpts(
            title=name, pos_left='center', pos_top='center',
            title_textstyle_opts=opts.TextStyleOpts(color=color, font_size=35)))
    )
    return tab
# Requirement 1: today's temperature curve
def temperature_line(hour, temperature) -> Line:
    line = (
        Line(init_opts=opts.InitOpts(width='50%'))
        .add_xaxis(hour)
        .add_yaxis("温度", temperature, is_connect_nones=True)
        .set_series_opts(label_opts=opts.LabelOpts(formatter='{@[1]}℃'))
        .set_global_opts(title_opts=opts.TitleOpts(title="当日平均温度变化曲线"), yaxis_opts=opts.AxisOpts(
            type_='value',
            axislabel_opts=opts.LabelOpts(formatter="{value} ℃")))
    )
    return line
# Requirement 2: today's average humidity
def humidity_liquid(humidity) -> Liquid:
    liquid = (
        Liquid(init_opts=opts.InitOpts(width='50%'))
        # .add("lq", [0.6, 0.7])
        .add("湿度", humidity, is_outline_show=False)
        .set_global_opts(title_opts=opts.TitleOpts(title="当日平均湿度"))
    )
    return liquid
# Requirement 3: today's wind speed
def windspeed_effectScatter(hour, windspeed) -> EffectScatter:
    effectScatter = (
        EffectScatter(init_opts=opts.InitOpts(width='50%'))
        .add_xaxis(hour)
        .add_yaxis("风速", windspeed, symbol=SymbolType.ARROW)
        .set_series_opts(label_opts=opts.LabelOpts(formatter='{@[1]}m/s'))
        .set_global_opts(title_opts=opts.TitleOpts(title="当日风速变化"), yaxis_opts=opts.AxisOpts(
            type_='value',
            axislabel_opts=opts.LabelOpts(formatter="{value}m/s")))
    )
    return effectScatter
# Requirement 3 (second chart): seven-day high and low temperature curves
def high_low_temperature_line(hour, high_temperature, low_temperature) -> Line:
    line = (
        Line(init_opts=opts.InitOpts(width='50%'))
        .add_xaxis(xaxis_data=hour)
        .add_yaxis(
            series_name="白天气温",
            y_axis=high_temperature,
            markpoint_opts=opts.MarkPointOpts(
                data=[
                    opts.MarkPointItem(type_="max", name="最大值"),
                    opts.MarkPointItem(type_="min", name="最小值"),
                ]
            ),
            markline_opts=opts.MarkLineOpts(
                data=[opts.MarkLineItem(type_="average", name="平均值")]
            ),
        )
        .add_yaxis(
            series_name="夜晚气温",
            y_axis=low_temperature,
            markpoint_opts=opts.MarkPointOpts(
                data=[
                    opts.MarkPointItem(type_="max", name="最大值"),
                    opts.MarkPointItem(type_="min", name="最小值"),
                ]
            ),
            markline_opts=opts.MarkLineOpts(
                data=[opts.MarkLineItem(type_="average", name="平均值")]
            ),
        )
        .set_series_opts(label_opts=opts.LabelOpts(formatter='{@[1]}℃'))
        .set_global_opts(
            title_opts=opts.TitleOpts(title="七日最高最低气温变化曲线"),
            tooltip_opts=opts.TooltipOpts(trigger="axis"),
            toolbox_opts=opts.ToolboxOpts(is_show=True),
            xaxis_opts=opts.AxisOpts(type_="category", boundary_gap=False),
            yaxis_opts=opts.AxisOpts(
                type_='value',
                axislabel_opts=opts.LabelOpts(formatter="{value} ℃"))
        )
    )
    return line
# Requirement 4: today's air pressure
def pressure_gauge(hour, pressure) -> Timeline:
    tl = Timeline(init_opts=opts.InitOpts(width='50%'))
    for i in range(len(hour)):
        gauge = (
            Gauge()
            .set_global_opts(title_opts=opts.TitleOpts(title="气压仪表盘"),
                             legend_opts=opts.LegendOpts(is_show=False))
            .add(min_=0, max_=1500, data_pair=[(f'{hour[i]}时气压', pressure[i])], series_name=f'气压',
                 detail_label_opts=opts.GaugeDetailOpts(formatter="{value}hPa"))
        )
        tl.add(gauge, '{}'.format(hour[i]))
    return tl
# Requirement 5: today's cloud cover scatter chart
def cloud_scatter(hour, cloud) -> Scatter:
    scatter = (
        Scatter(init_opts=opts.InitOpts(width='50%'))
        .add_xaxis(hour)
        .add_yaxis("云量", cloud)
        .set_series_opts(label_opts=opts.LabelOpts(formatter='{@[1]}%'))
        .set_global_opts(
            title_opts=opts.TitleOpts(title="当日云量散点图"),
            visualmap_opts=opts.VisualMapOpts(type_="size", max_=150, min_=20),
            yaxis_opts=opts.AxisOpts(
                type_='value',
                axislabel_opts=opts.LabelOpts(formatter="{value}%"))
        )
    )
    return scatter
# Requirement 6: seven-day forecast table
def seven_day_table(seven_day) -> Table:
    headers = ['日期', '星期', '白天天气', '白天风向', '白天风力', '白天气温', '晚上气温', '晚上天气', '晚上风向', '晚上风力']
    table = (
        Table()
        .add(headers, seven_day)
        .set_global_opts(
            title_opts=ComponentTitleOpts(title="七日天气预报")
        )
    )
    return table
# All requirements combined into one page
def get_chart(city):
    # Fetch the data
    seven_dayList, data_list, high_temperature, low_temperature, time_list, temperature_lists, \
    windspeed_lists, pressure_lists, humidity_lists, cloud_lists = weather_data(city)
    page = Page(layout=Page.SimplePageLayout)
    page.add(
        tab(f'{city}未来七日天气预报', '#000000'),
        temperature_line(time_list, temperature_lists),
        high_low_temperature_line(data_list, high_temperature, low_temperature),
        humidity_liquid(humidity_lists),
        pressure_gauge(time_list, pressure_lists),
        windspeed_effectScatter(time_list, windspeed_lists),
        cloud_scatter(time_list, cloud_lists),
        seven_day_table(seven_dayList)
    )
    # page.render("page_draggable_layout.html")
    return page
# def get_chart(city):
#     # Fetch the data
#     seven_dayList, data_list, high_temperature, low_temperature, time_list, temperature_lists, \
#     windspeed_lists, pressure_lists, humidity_lists, cloud_lists = weather_data(city)
#     return temperature_line(time_list, temperature_lists), \
#            humidity_liquid(humidity_lists), \
#            windspeed_effectScatter(time_list, windspeed_lists), \
#            high_low_temperature_line(data_list, high_temperature, low_temperature), \
#            pressure_gauge(time_list, pressure_lists), \
#            cloud_scatter(time_list, cloud_lists), \
#            seven_day_table(seven_dayList)
if __name__ == '__main__':
    get_chart('衡阳市')

Approach: there is not much to say about drawing the charts themselves; see the official documentation (Introduction - pyecharts - A Python Echarts Plotting Library built with love).

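Be sure to combine the charts with one of pyecharts' composite chart types and return a single object. Do not return several chart objects the way the commented-out get_chart function does.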


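For the composite chart I recommend Page, as used in get_chart. I also tried Grid at the time, but the charts ended up badly positioned and untidy, which came down to my own inexperience with it. (Lesson learned: read the official documentation more carefully; this cost me a lot of time.)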


3. Front-end code in the templates package and app.py

3.1. app.py
from flask import Flask, render_template, request, send_from_directory, render_template_string
from pyecharts.render import make_snapshot, snapshot
from WeatherXM.service.dataShow import get_chart
app = Flask(__name__, static_folder="static", template_folder='templates')
@app.route("/")
def index():
    return render_template("index.html")
@app.route("/getdata", methods=['GET'])
def get_weather_data():
    city = request.args.get('city')
    page = get_chart(city)
    # page.render('page.html')
    # print(a.dump_options_with_quotes())
    # return send_from_directory('../', 'page.html')  # works

    # return render_template('page1.html', chart=page.render_embed())  # did not work
    chart = page.render_embed()

    return render_template_string(chart)
if __name__ == "__main__":
    app.run()
3.2. index.html



    
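[index.html: markup not preserved. Recoverable text: page title "Awesome-pyecharts"; input label "天气查询:" ("Weather query:").]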
    
    





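Approach: clicking Search sends the city to the back end, the back end returns an HTML string, and the front end renders the returned string as HTML.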
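
The same request can be issued from Python, which is handy for checking the /getdata route without a browser. A minimal sketch, assuming the app is running on Flask's default development address http://127.0.0.1:5000; getdata_url and fetch_chart_html are illustrative names, not part of the project.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:5000"  # assumption: Flask's default development address


def getdata_url(city, base_url=BASE_URL):
    """Build the /getdata URL; urlencode percent-encodes non-ASCII city names."""
    return f"{base_url}/getdata?{urlencode({'city': city})}"


def fetch_chart_html(city, base_url=BASE_URL):
    """Request the charts for city and return the HTML string the route sends back."""
    with urlopen(getdata_url(city, base_url)) as resp:
        return resp.read().decode("utf-8")


request_url = getdata_url("上海")
```

fetch_chart_html only works while the Flask server is running, so it is defined here but not called.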

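Problem encountered: here I use render_template_string, which renders a simple template or dynamically generated HTML, and return the result to the front end. Alternatively, send_from_directory can return the already rendered page.html file.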

Why not use a.dump_options_with_quotes() as shown in the official documentation (it serializes the options of chart a to a JSON string)?

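Answer: because I could not get several charts back in one response. The front end could not parse several results, and when I converted each chart to JSON separately the front end did not receive them.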

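My first idea was to return several chart objects, either as a list of a.dump_options_with_quotes() results or as JSON, but the front end never received the data, which was frustrating. Then I read the Flask documentation and found these two functions. Feel free to try other approaches.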
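
One approach that might make the multi-chart variant work is to merge the option strings into a single JSON document before responding. A minimal sketch, assuming each chart's dump_options_with_quotes() result is a valid JSON string; stand-in strings are used so the example runs without pyecharts, and bundle_options and the key names are illustrative.

```python
import json


def bundle_options(named_option_json):
    """Merge several JSON option strings into one JSON document keyed by chart name."""
    return json.dumps(
        {name: json.loads(text) for name, text in named_option_json.items()},
        ensure_ascii=False,
    )


# Stand-ins for the strings chart.dump_options_with_quotes() would return.
payload = bundle_options({
    "temperature": '{"series": [{"type": "line", "data": [1, 2]}]}',
    "humidity": '{"series": [{"type": "liquidFill", "data": [0.5]}]}',
})
```

The route could then return payload with a JSON content type, and the front end would pass each value to the setOption call of its own ECharts instance. I have not tested this against the project.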

4. static (resource files)

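Note: be sure to set static_folder and template_folder. In my setup the resources would not load otherwise, and I cannot say why. (Both values match Flask's defaults, which are resolved relative to the application's root path.)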

cityinfo.xls download link: cityinfo.xls


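That completes the project; it is ready to start.

IV. Starting the project

Click Edit Configurations in the top-right corner and set up a Flask server run configuration.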


Once it is configured, click Run, then click the address that is printed.


gitee repository: https://gitee.com/TheQuietCoder/WeatherXM

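V. Takeaways

The project sounds simple, but plenty of problems came up along the way: hit a bug, fix a bug, repeat. Sometimes I combed through page after page of search results without finding a fix for a single bug. The process takes patience, and once you do solve the problem it no longer looks like much in hindsight. I believe these bugs eventually become stepping stones to your success; looking back, "the light boat has already passed ten thousand mountains" (轻舟已过万重山).

Writing up the project took effort, so if you found it useful, please leave a like 👍.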