Azahar: Graphs with Python 2

Graphs with Python 1 (separate page)
1. Installing Python
2. Setting up a Python venv (Virtual Environments)
3. Configuring matplotlib, used for the graphs, to display Japanese

Graphs with Python 2
4. Separately, downloading the JMA CSV data by hand
5. A script that ran by sheer luck

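4. Separately, downloading the JMA CSV data by hand
On the Japan Meteorological Agency (JMA) page [過去の気象データ・ダウンロード] (Past Weather Data Download) https://www.data.jma.go.jp/risk/obsdl/index.php, choose the station, the item (here "時別値:気温", hourly values: temperature) and the period, then click [CSVファイルをダウンロード] (Download CSV file).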

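Notes:
・The official definition of a day's boundary differs from how the CSV records it, so download a somewhat longer period than needed and trim it afterwards in Calc. Section 2.3.1 "合計値" (totals) of the guide to meteorological observation statistics https://www.data.jma.go.jp/obd/stats/data/kaisetu/shishin/shishin_all.pdf says "24 observations from 01:00 to 24:00 ...", but in the downloaded CSV one day runs from 00:00:00 to 23:00:00.
・Observations on the hour, every hour, exist from 1990 onward, but for 1989 there is only data every 3 hours. The further back you go, the less data there is.
・The downloaded CSV is Shift-JIS encoded. The ordinary file command reports 'data.csv: Non-ISO extended-ASCII text'. The file2 command from the file-kanji package, which detects kanji encodings, reports 'data.csv: SJIS text' when run as $ file2 data.csv. LibreOffice Calc also recognizes it as '文字エンコーディング: 日本語 (Shift-JIS)' (Character encoding: Japanese (Shift-JIS)) when opening the file.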
・Checking the number of days in February 1990 and February 2018 in Python's interactive mode:
>>> import calendar
>>> calendar.monthrange(1990,2)
(3, 28)    ← the first day of the month is weekday 3 (0 = Mon, 1 = Tue, 2 = Wed, 3 = Thu), a Thursday, and the month has 28 days
>>> print(calendar.month(2018, 2))
   February 2018
Mo Tu We Th Fr Sa Su
          1  2  3  4
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28
>>> quit()
Both have the same 28 days. So I saved the 1990 data (1990-01-01 00:00:00 through 1990-12-31 23:00:00) as 1990data.csv and the 2018 data as 2018data.csv in /home/$USER/venv/test/data/.
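The day-boundary note above can be sketched in code. This is only a minimal illustration, assuming the record stamped 00:00:00 in the CSV corresponds to "24:00" of the previous day under the 01:00 to 24:00 rule; the function names and sample values are hypothetical, not taken from the JMA data.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def statistical_day(stamp: datetime) -> date:
    """Return the day a record belongs to when a day spans 01:00 to 24:00."""
    # Assumption: 24:00 of day D is stored as 00:00 of day D+1,
    # so step back one hour before taking the date.
    return (stamp - timedelta(hours=1)).date()


def daily_totals(records):
    """Sum (timestamp, value) pairs per statistical day."""
    totals = defaultdict(float)
    for stamp, value in records:
        totals[statistical_day(stamp)] += value
    return dict(totals)


# Hypothetical hourly values, not real observations.
records = [
    (datetime(1990, 1, 1, 23, 0), 1.0),
    (datetime(1990, 1, 2, 0, 0), 2.0),  # counted as 24:00 of Jan 1
    (datetime(1990, 1, 2, 1, 0), 4.0),  # first record of Jan 2
]
print(daily_totals(records))
```

Under this rule the last record needed for a given year is the one stamped 00:00:00 on January 1 of the following year, which is why downloading a slightly longer period helps.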

5. A script that ran by sheer luck
For the temperature data, I created a data directory under test and placed 1990data.csv and 2018data.csv there:
/home/$USER/venv/test/data/1990data.csv
/home/$USER/venv/test/data/2018data.csv
The source file is plot_test.py, in the test directory under venv:
/home/$USER/venv/test/plot_test.py

# coding: utf-8                    (lets Japanese text through; Debian here uses LANG=ja_JP.UTF-8)
import pandas as pd                # handy library for CSV files
import matplotlib.pyplot as plt    # library for drawing graphs
# matplotlib.use('Agg')            # used for PNG output with savefig(); needs "import matplotlib". Commented out here

# Read the 1990 CSV as Shift-JIS, drop the unneeded rows, and convert
# the string columns to datetime and float
f1 = pd.read_csv("data/1990data.csv", encoding="SHIFT-JIS", header=None, names=["1990yr", "1990tempe", "tr1", "tr2"])
df1 = f1.drop(range(0, 5)).reset_index(drop=True)
df1['1990yr'] = pd.to_datetime(df1['1990yr'])
df1['1990tempe'] = df1['1990tempe'].astype(float)

# Same processing for the 2018 CSV
f2 = pd.read_csv("data/2018data.csv", encoding="SHIFT-JIS", header=None, names=["2018yr", "2018tempe", "tr1", "tr2"])
df2 = f2.drop(range(0, 5)).reset_index(drop=True)
df2['2018yr'] = pd.to_datetime(df2['2018yr'])
df2['2018tempe'] = df2['2018tempe'].astype(float)

# Sum the temperatures per month. Only the temperature column is selected:
# the original summed every column and relied on pandas dropping the
# non-numeric ones, which newer pandas versions no longer do.
df1_new = df1.groupby(df1['1990yr'].dt.month)[['1990tempe']].sum()
df2_new = df2.groupby(df2['2018yr'].dt.month)[['2018tempe']].sum()

# Attach 2018 to 1990
df = df1_new.join(df2_new)
# print(df)                        # used once to check the state

# Graph labels and so on; otherwise a bar chart with the defaults
df.plot.bar()
plt.legend()
plt.title('某地点の毎時温度を月別で合算')   # "Hourly temperatures at a certain station, summed by month"
plt.xlabel("月")                   # "Month"
plt.ylabel("℃", rotation=0)       # keep the y-axis "℃" upright
plt.xticks(rotation=0)             # keep the month ticks on the x axis upright
plt.grid(True)

# Show the graph
plt.show()
# plt.savefig("bar.png")           # PNG output. Does doing both show and savefig need background processing?
# ------- end

Before running, check whether matplotlib is installed in the venv.
(venv) :~/venv$ pip list    ← $ pip show matplotlib would be more direct
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
cycler (0.10.0)
kiwisolver (1.0.1)
matplotlib (3.0.3)
numpy (1.16.2)
pandas (0.24.1)
pip (9.0.1)
pkg-resources (0.0.0)
pyparsing (2.3.1)
python-dateutil (2.8.0)
pytz (2018.9)
setuptools (32.3.1)
six (1.12.0)
If it is missing:
(venv) :~/venv$ pip install matplotlib
....
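The same check can also be done from inside Python instead of reading the pip output. A minimal sketch, assuming Python 3.8 or later (importlib.metadata is not available in older versions); the helper names are made up for illustration.

```python
import importlib.util
from importlib import metadata


def is_importable(module: str) -> bool:
    """True if `import module` would find the module in this environment."""
    return importlib.util.find_spec(module) is not None


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("matplotlib", "pandas"):
    version = installed_version(name)
    print(name, version if version else "not installed")
```

Run inside the activated venv, this reports on that environment's packages rather than the system-wide ones.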
Run the program:
(venv) :~/venv/test$ python plot_test.py
A window opens and the graph is displayed. Closing the window returns you to the (venv) prompt.
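To cross-check the bar heights without pandas, the load-and-sum step can be sketched with the standard library alone. This is a rough sketch under assumptions: five leading non-data rows (the count the script drops), timestamps written like 1990/1/1 0:00:00, and the temperature in the second column. The sample bytes are invented, not real JMA data.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

HEADER_ROWS = 5  # assumption: same number of rows the script drops


def monthly_totals(raw_bytes: bytes, header_rows: int = HEADER_ROWS) -> dict:
    """Decode a Shift-JIS CSV and sum the second column per month."""
    text = raw_bytes.decode("shift_jis")
    rows = list(csv.reader(io.StringIO(text)))[header_rows:]
    totals = defaultdict(float)
    for row in rows:
        if len(row) < 2 or not row[1]:
            continue  # skip blank lines and missing values
        # Assumed timestamp layout; adjust the format if the file differs.
        stamp = datetime.strptime(row[0], "%Y/%m/%d %H:%M:%S")
        totals[stamp.month] += float(row[1])
    return dict(totals)


# Hypothetical file contents, encoded the way the download is.
sample = (
    "header row 1\n"
    "header row 2\n"
    "header row 3\n"
    "年月日時,気温\n"
    "header row 5\n"
    "1990/1/1 0:00:00,1.5,8,1\n"
    "1990/1/1 1:00:00,2.5,8,1\n"
    "1990/2/1 0:00:00,3.0,8,1\n"
).encode("shift_jis")

print(monthly_totals(sample))
```

For a real file, pass the bytes read with open("data/1990data.csv", "rb").read() and compare the result with the printed DataFrame.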

That's all: notes from a complete amateur.

Things got tedious the moment decimal points showed up!
Addition: I added
s1 = round(df['1990tempe'].sum(), 2)
s2 = round(df['2018tempe'].sum(), 2)
text1 = '1990 total : ' + format(s1)
text2 = '2018 total : ' + format(s2)
and
plt.text(0, 22000, text1, fontsize=10)
plt.text(0, 21000, text2, fontsize=10)
to the script.
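One way to tame the decimals, offered only as a sketch: round() followed by format() can still print a varying number of digits, whereas a fixed format spec always prints two. The helper name and the totals below are hypothetical, not values from the actual data.

```python
def total_label(year: int, total: float) -> str:
    """Build a label such as '1990 total : 21034.10' with two fixed decimals."""
    return "{} total : {:.2f}".format(year, total)


# Hypothetical totals, not taken from the real CSV files.
print(format(round(21034.1, 2)))     # trailing zero is dropped
print(total_label(1990, 21034.1))    # always two decimals

# Float sums can pick up tiny errors; the format spec hides them in the label.
print(sum([0.1] * 3))
print(total_label(2018, sum([0.1] * 3)))
```

The resulting string can be passed to plt.text() in place of text1 and text2.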

Sunday, March 10, 2019
