finance/财报筛选/金字塔选股.ipynb
2025-01-20 00:49:23 +08:00

{
"cells": [
{
"cell_type": "code",
"id": "initial_id",
"metadata": {
"collapsed": true,
"ExecuteTime": {
"end_time": "2025-01-19T16:42:26.631868Z",
"start_time": "2025-01-19T16:42:26.628635Z"
}
},
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import tushare as ts\n",
"\n",
"ts_pro = ts.pro_api(token=\"64ebff4fa679167600b905ee45dd88e76f3963c0ff39157f3f085f0e\")"
],
"outputs": [],
"execution_count": 459
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:27.857016Z",
"start_time": "2025-01-19T16:42:26.639161Z"
}
},
"cell_type": "code",
"source": [
"import loader as loader\n",
"\n",
"# 加载财报信息\n",
"source_finance_df = loader.load_finance()\n",
"finance_df = pd.DataFrame()\n",
"finance_df[[\n",
" \"code\",\n",
" # 年份\n",
" \"year\",\n",
" # 股东权益合计(含少数股东权益)\n",
" \"total_stockholder_interest\",\n",
" # 净利润\n",
" \"net_income\",\n",
" # 总资产\n",
" \"total_assets\",\n",
" # 营业总收入\n",
" \"total_revenue\",\n",
" # 存货\n",
" \"inventories\",\n",
" # 应收账款\n",
" \"accounts_receivable\",\n",
" # 营业成本\n",
" \"operating_costs\",\n",
" # 营业利润\n",
" \"operating_profit\",\n",
" # 现金与现金等价物\n",
" \"cash\",\n",
" # 营业活动现金流量净值\n",
" \"operating_net_cash_flow\",\n",
"]] = source_finance_df[[\n",
" \"ts_code\",\n",
" \"end_date\",\n",
" \"total_hldr_eqy_inc_min_int\",\n",
" \"n_income\",\n",
" \"total_assets\",\n",
" \"total_revenue\",\n",
" \"inventories\",\n",
" \"accounts_receiv\",\n",
" \"oper_cost\",\n",
" \"operate_profit\",\n",
" \"money_cap\",\n",
" \"n_cashflow_act\",\n",
"]]\n",
"finance_df[\"score\"] = 0\n",
"finance_df = finance_df.sort_values(by=[\"code\", \"year\"], ascending=True).reset_index(drop=True)"
],
"id": "68b2debc14502fd5",
"outputs": [],
"execution_count": 460
},
{
"metadata": {},
"cell_type": "markdown",
"source": "过滤上市时间大于5年的企业由于计算很多数值需要与前一年比所以实际上需要的是上市时间至少6年的企业",
"id": "3052b8c4442f3643"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:27.931899Z",
"start_time": "2025-01-19T16:42:27.868963Z"
}
},
"cell_type": "code",
"source": "finance_df = finance_df.groupby(\"code\").filter(lambda x: len(x) > 6)",
"id": "4293bd93ea8f9ed",
"outputs": [],
"execution_count": 461
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"ROE\n",
"\n",
"股东权益报酬率(%) RoE\t股东权益报酬率RoE 是判断股票是否具备”长期上涨潜力“最重要的指标之一, RoE 如果既稳定又优秀,那这家公司的其他指标也基本不会差。 \n",
"RoE均值 >= 35\t550分 \n",
"35 > RoE均值 >= 30\t500分 \n",
"30 > RoE均值 >= 25\t450分 \n",
"25 > RoE均值 >= 20\t400分 \n",
"20 > RoE均值 >= 15\t300分 \n",
"15 > RoE均值 >= 10\t250分 \n",
"10 > RoE均值 或 0 >= 其中一年\t0分 "
],
"id": "336aa970ee46709d"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:29.192712Z",
"start_time": "2025-01-19T16:42:27.946033Z"
}
},
"cell_type": "code",
"source": [
"finance_df[\"prev_total_stockholder_interest\"] = finance_df.groupby(\"code\")[\"total_stockholder_interest\"].shift(1)\n",
"finance_df[\"roe\"] = finance_df[\"net_income\"] / (\n",
" (finance_df[\"prev_total_stockholder_interest\"] + finance_df[\"total_stockholder_interest\"]) / 2)\n",
"finance_df[\"average_roe\"] = finance_df.groupby(\"code\")[\"roe\"].rolling(window=5).mean().reset_index(0, drop=True)\n",
"finance_df[\"score_roe\"] = 0\n",
"finance_df.loc[finance_df[\"average_roe\"] >= 0.35, \"score_roe\"] += 550\n",
"finance_df.loc[(finance_df[\"average_roe\"] >= 0.30) & (finance_df[\"average_roe\"] < 0.35), \"score_roe\"] = 500\n",
"finance_df.loc[(finance_df[\"average_roe\"] >= 0.25) & (finance_df[\"average_roe\"] < 0.30), \"score_roe\"] = 450\n",
"finance_df.loc[(finance_df[\"average_roe\"] >= 0.20) & (finance_df[\"average_roe\"] < 0.25), \"score_roe\"] = 400\n",
"finance_df.loc[(finance_df[\"average_roe\"] >= 0.15) & (finance_df[\"average_roe\"] < 0.20), \"score_roe\"] = 300\n",
"finance_df.loc[(finance_df[\"average_roe\"] >= 0.10) & (finance_df[\"average_roe\"] < 0.15), \"score_roe\"] = 250\n",
"finance_df[\"score\"] += finance_df[\"score_roe\"]\n",
"\n",
"\n",
"def reset_score_for_average_roe(group):\n",
" group.loc[group['average_roe'].rolling(window=5, min_periods=1).min() < 0, 'score'] = 0\n",
" return group\n",
"\n",
"\n",
"finance_df = finance_df.groupby(\"code\").apply(reset_score_for_average_roe).reset_index(drop=True)"
],
"id": "f050d33c4a0cd720",
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\lanyuanxiaoyao\\AppData\\Local\\Temp\\ipykernel_28824\\1604170078.py:20: DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.\n",
" finance_df = finance_df.groupby(\"code\").apply(reset_score_for_average_roe).reset_index(drop=True)\n"
]
}
],
"execution_count": 462
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"总资产报酬率(%) RoA\t总资产报酬率RoA 能真实反映出一家公司的获利能力, 相当于一家公司的 「投资年化率」,当然越高越好。\n",
"RoA均值 >= 15\t100分\n",
"15 > RoA均值 >= 11\t80分\n",
"11 > RoA均值 >= 7\t50分\n",
"7 > RoA均值\t0分"
],
"id": "6c2e1c602898e342"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:29.363249Z",
"start_time": "2025-01-19T16:42:29.219453Z"
}
},
"cell_type": "code",
"source": [
"finance_df[\"prev_total_assets\"] = finance_df.groupby(\"code\")[\"total_assets\"].shift(1)\n",
"finance_df[\"roa\"] = finance_df[\"net_income\"] / ((finance_df[\"prev_total_assets\"] + finance_df[\"total_assets\"]) / 2)\n",
"finance_df[\"average_roa\"] = finance_df.groupby(\"code\")[\"roa\"].rolling(window=5).mean().reset_index(0, drop=True)\n",
"finance_df[\"score_roa\"] = 0\n",
"finance_df.loc[finance_df[\"average_roa\"] >= 0.15, \"score_roa\"] += 100\n",
"finance_df.loc[(finance_df[\"average_roa\"] >= 0.11) & (finance_df[\"average_roa\"] < 0.15), \"score_roa\"] += 80\n",
"finance_df.loc[(finance_df[\"average_roa\"] >= 0.07) & (finance_df[\"average_roa\"] < 0.11), \"score_roa\"] += 50\n",
"finance_df[\"score\"] += finance_df[\"score_roa\"]"
],
"id": "3b585cf8e2eb5e3",
"outputs": [],
"execution_count": 463
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"税后净利 规模(百万)\t获利规模大那股价的”长期上涨潜力“也就会越稳健、越确定。选择时大公司的优先级要高于小公司。\n",
"税后净利均值 >= 10000\t150分\n",
"10000 > 税后净利均值 >= 1000\t100分\n",
"1000 > 税后净利均值\t0分"
],
"id": "dbcb1d6b23582f3e"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:29.878285Z",
"start_time": "2025-01-19T16:42:29.369862Z"
}
},
"cell_type": "code",
"source": [
"finance_df['average_net_income'] = finance_df.groupby('code')['net_income'].transform(lambda x: x.rolling(5).mean())\n",
"finance_df[\"score_net_income\"] = 0\n",
"finance_df.loc[finance_df[\"average_net_income\"] >= 10000 * 10000000, \"score_net_income\"] = 150\n",
"finance_df.loc[(finance_df[\"average_net_income\"] >= 1000 * 10000000) & (\n",
" finance_df[\"average_net_income\"] < 10000 * 10000000), \"score_net_income\"] = 100\n",
"finance_df[\"score\"] += finance_df[\"score_net_income\"]"
],
"id": "fd5582e080102e20",
"outputs": [],
"execution_count": 464
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"现金状况 分析\t总资产周转率 与 现金与约当现金占总资产的关系,可以看出公司的现金状况是否健康。\n",
"规则①:总资产周转率 > 0.8 且 现金与约当现金占总资产(%) >= 10\t50分\n",
"规则②:总资产周转率 < 0.8 且 现金与约当现金占总资产(%) >= 20\t50分\n",
"不符合 规则① 或 规则②\t0分"
],
"id": "2c6c934a376eef0d"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:29.905727Z",
"start_time": "2025-01-19T16:42:29.889468Z"
}
},
"cell_type": "code",
"source": [
"finance_df[\"prev_total_assets\"] = finance_df.groupby(\"code\")[\"total_assets\"].shift(1)\n",
"finance_df[\"total_assets_turnover_ratio\"] = finance_df[\"total_revenue\"] / (\n",
" (finance_df[\"prev_total_assets\"] + finance_df[\"total_assets\"]) / 2)\n",
"\n",
"finance_df[\"cash_ratio\"] = finance_df[\"cash\"] / finance_df[\"total_assets\"]\n",
"\n",
"finance_df[\"score_assets_turnover_and_cash\"] = 0\n",
"finance_df.loc[(finance_df[\"total_assets_turnover_ratio\"] >= 0.8) & (\n",
" finance_df[\"cash_ratio\"] >= 0.1), \"score_assets_turnover_and_cash\"] = 50\n",
"finance_df.loc[(finance_df[\"total_assets_turnover_ratio\"] < 0.8) & (\n",
" finance_df[\"cash_ratio\"] >= 0.2), \"score_assets_turnover_and_cash\"] = 50\n",
"finance_df[\"score\"] += finance_df[\"score_assets_turnover_and_cash\"]"
],
"id": "bc92e050c82c3768",
"outputs": [],
"execution_count": 465
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"收现日数(日)\t平均收现日数越短说明公司的经营能力越强。\n",
"30 >= 平均收现日数\t20分\n",
"平均收现日数 > 30\t0分\n",
"\n",
"销货日数(日)\t平均销货日数越短说明公司商品越畅销。\n",
"30 >= 平均销货日数\t20分\n",
"平均销货日数 > 30\t0分\n",
"\n",
"收现日数+销货日数(日)\t生意周期 是否足够优秀,越短说明每年能做的生意趟数越多,经营能力就越强。\n",
"40 >= 收现日数+销货日数\t20分\n",
"60 >= 收现日数+销货日数 > 40\t10分\n",
"平均销货日数 > 60\t0分"
],
"id": "41ac526b0296b44b"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:29.923829Z",
"start_time": "2025-01-19T16:42:29.916781Z"
}
},
"cell_type": "code",
"source": [
"# 收现日数\n",
"finance_df[\"collection_cash_period\"] = 360 / (finance_df[\"total_revenue\"] / finance_df[\"accounts_receivable\"])\n",
"# 销货日数\n",
"finance_df[\"sales_period\"] = 360 / (finance_df[\"operating_costs\"] / finance_df[\"accounts_receivable\"])\n",
"\n",
"finance_df[\"score_collection_cash_period\"] = 0\n",
"finance_df.loc[(not finance_df[\"score_collection_cash_period\"].isna) & (\n",
" finance_df[\"collection_cash_period\"] < 30), \"score_collection_cash_period\"] = 20\n",
"finance_df[\"score\"] += finance_df[\"score_collection_cash_period\"]\n",
"\n",
"finance_df[\"score_sales_period\"] = 0\n",
"finance_df.loc[\n",
" (not finance_df[\"score_sales_period\"].isna) & (finance_df[\"score_sales_period\"] < 30), \"score_sales_period\"] = 20\n",
"finance_df[\"score\"] += finance_df[\"score_sales_period\"]\n",
"\n",
"finance_df[\"score_collection_cash_period_and_sales_period\"] = 0\n",
"finance_df.loc[(finance_df[\"collection_cash_period\"] + finance_df[\n",
" \"sales_period\"]) < 40, \"score_collection_cash_period_and_sales_period\"] = 20\n",
"finance_df.loc[(finance_df[\"collection_cash_period\"] + finance_df[\n",
" \"sales_period\"]) < 60, \"score_collection_cash_period_and_sales_period\"] = 10\n",
"finance_df[\"score\"] += finance_df[\"score_collection_cash_period_and_sales_period\"]"
],
"id": "baeb44a4fb28b60b",
"outputs": [],
"execution_count": 466
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"毛利率(%)\t毛利率 是否保持平稳,不大起大落。\n",
"30% >= 毛利率平均波动幅度\t50分\n",
"毛利率平均波动幅度 > 30%\t0分"
],
"id": "a53bfdb799df8f48"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:30.005541Z",
"start_time": "2025-01-19T16:42:29.935348Z"
}
},
"cell_type": "code",
"source": [
"finance_df[\"gross_profit_ratio\"] = (finance_df[\"total_revenue\"] - finance_df[\"operating_costs\"]) / finance_df[\n",
" \"total_revenue\"]\n",
"finance_df[\"gross_profit_ratio_std\"] = finance_df.groupby(\"code\")[\"gross_profit_ratio\"].rolling(\n",
" window=5).std().reset_index(0, drop=True)\n",
"finance_df[\"score_gross_profit_ratio\"] = 0\n",
"finance_df[(not finance_df[\"gross_profit_ratio_std\"].isna) & (finance_df[\"gross_profit_ratio_std\"] < 30)] = 50\n",
"finance_df[\"score\"] += finance_df[\"score_gross_profit_ratio\"]"
],
"id": "9d23bdf60a1839c9",
"outputs": [],
"execution_count": 467
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"经营安全边际率(%)\t经营安全边际率越高说明公司能够很好的控制成本把大部分利润转化成实际收益 获利能力很强,有利于抵御经济波动和价格竞争等影响。\n",
"经营安全边际率 >= 70\t50分\n",
"70 > 经营安全边际率 >= 50\t30分\n",
"50 > 经营安全边际率 >= 30\t10分\n",
"30 > 经营安全边际率\t0分"
],
"id": "6613ab8263d4afef"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:30.022774Z",
"start_time": "2025-01-19T16:42:30.016550Z"
}
},
"cell_type": "code",
"source": [
"finance_df[\"operating_profit_ratio\"] = finance_df[\"operating_profit\"] / finance_df[\"total_revenue\"]\n",
"finance_df[\"operating_safety_margin\"] = finance_df[\"operating_profit_ratio\"] / finance_df[\"gross_profit_ratio\"]\n",
"finance_df[\"score_operating_safety_margin\"] = 0\n",
"finance_df.loc[finance_df[\"operating_safety_margin\"] >= 70, \"score_operating_safety_margin\"] = 50\n",
"finance_df.loc[finance_df[\"operating_safety_margin\"] >= 50, \"score_operating_safety_margin\"] = 30\n",
"finance_df.loc[finance_df[\"operating_safety_margin\"] >= 30, \"score_operating_safety_margin\"] = 10\n",
"finance_df[\"score\"] += finance_df[\"score_operating_safety_margin\"]"
],
"id": "f7d9486af89cb710",
"outputs": [],
"execution_count": 468
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"税后净利\n",
"年份由近(A)~远(E)\t公司赚的钱是否在逐年增长是股价能长期上涨的基础。\n",
"A年 > B年 (+) / A年 < B年 (-)\t+30分 / -30分\n",
"B年 > C年 (+) / B年 < C年 (-)\t+25分 / -25分\n",
"C年 > D年 (+) / C年 < D年 (-)\t+20分 / -20分\n",
"D年 > E年 (+) / D年 < E年 (-)\t+15分 / -15分"
],
"id": "2ee3712525455954"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-01-19T16:42:46.378919Z",
"start_time": "2025-01-19T16:42:30.718209Z"
}
},
"cell_type": "code",
"source": [
"def score_by_net_income_ascending(group):\n",
" # 计算 score_net_income_1\n",
" group['net_income_shift_1'] = group['net_income'].shift(1)\n",
" group['score_net_income_1'] = (group['net_income'] > group['net_income_shift_1']).map(\n",
" {True: 30, False: -30})\n",
" group = group.mask(pd.isna(group[\"net_income_shift_1\"]), other=0)\n",
" # 计算 score_net_income_2\n",
" group['net_income_shift_2'] = group['net_income'].shift(2)\n",
" group['score_net_income_2'] = (group['net_income_shift_1'] > group['net_income_shift_2']).map(\n",
" {True: 25, False: -25})\n",
" # 计算 score_net_income_3\n",
" group['net_income_shift_3'] = group['net_income'].shift(3)\n",
" group['score_net_income_3'] = (group['net_income_shift_2'] > group['net_income_shift_3']).map(\n",
" {True: 20, False: -20})\n",
" # 计算 score_net_income_4\n",
" group['net_income_shift_4'] = group['net_income'].shift(4)\n",
" group['score_net_income_4'] = (group['net_income_shift_3'] > group['net_income_shift_4']).map(\n",
" {True: 15, False: -15}).fillna(0)\n",
" return group\n",
"\n",
"\n",
"finance_df = finance_df.groupby(\"code\").apply(score_by_net_income_ascending).reset_index(drop=True)\n",
"finance_df[\"score_net_income_ascending\"] = np.sum(\n",
" [finance_df[\"score_net_income_1\"], finance_df[\"score_net_income_2\"], finance_df[\"score_net_income_3\"],\n",
" finance_df[\"score_net_income_4\"]], axis=0)\n",
"finance_df"
],
"id": "2d1ca7fc7873ce71",
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\lanyuanxiaoyao\\AppData\\Local\\Temp\\ipykernel_28824\\3206398043.py:22: DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.\n",
" finance_df = finance_df.groupby(\"code\").apply(score_by_net_income_ascending).reset_index(drop=True)\n"
]
},
{
"data": {
"text/plain": [
" code year total_stockholder_interest net_income \\\n",
"0 0 0 0.000000e+00 0.000000e+00 \n",
"1 000001.SZ 2006 6.474463e+09 1.302907e+09 \n",
"2 000001.SZ 2007 1.300606e+10 2.649903e+09 \n",
"3 000001.SZ 2008 1.640079e+10 6.140350e+08 \n",
"4 000001.SZ 2009 2.046961e+10 5.030729e+09 \n",
"... ... ... ... ... \n",
"70725 871981.BJ 2019 1.484536e+08 1.933833e+07 \n",
"70726 871981.BJ 2020 1.963500e+08 3.113004e+07 \n",
"70727 871981.BJ 2021 4.901179e+08 6.549797e+07 \n",
"70728 871981.BJ 2022 5.238630e+08 4.359533e+07 \n",
"70729 871981.BJ 2023 5.044336e+08 -5.665176e+06 \n",
"\n",
" total_assets total_revenue inventories accounts_receivable \\\n",
"0 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 \n",
"1 2.605763e+11 7.135218e+09 NaN NaN \n",
"2 3.525394e+11 1.080750e+10 NaN NaN \n",
"3 4.744402e+11 1.451312e+10 NaN NaN \n",
"4 5.878110e+11 1.511444e+10 NaN NaN \n",
"... ... ... ... ... \n",
"70725 3.438787e+08 2.288707e+08 6.439933e+07 9.145441e+07 \n",
"70726 4.407381e+08 3.221584e+08 7.076865e+07 1.028702e+08 \n",
"70727 7.806398e+08 4.748939e+08 1.102946e+08 1.094223e+08 \n",
"70728 8.141886e+08 3.872667e+08 1.034502e+08 7.252802e+07 \n",
"70729 7.671820e+08 3.613133e+08 1.070212e+08 7.962033e+07 \n",
"\n",
" operating_costs operating_profit ... score_operating_safety_margin \\\n",
"0 0.000000e+00 0.000000e+00 ... 0 \n",
"1 NaN 1.905169e+09 ... 0 \n",
"2 NaN 3.721942e+09 ... 0 \n",
"3 NaN 8.034260e+08 ... 0 \n",
"4 NaN 6.159127e+09 ... 0 \n",
"... ... ... ... ... \n",
"70725 1.759984e+08 1.982603e+07 ... 0 \n",
"70726 2.444144e+08 3.466142e+07 ... 0 \n",
"70727 3.501988e+08 7.194507e+07 ... 0 \n",
"70728 3.141146e+08 2.827675e+07 ... 0 \n",
"70729 3.220560e+08 -1.192680e+07 ... 0 \n",
"\n",
" net_income_shift_1 score_net_income_1 net_income_shift_2 \\\n",
"0 0.000000e+00 0 NaN \n",
"1 3.110076e+08 30 NaN \n",
"2 1.302907e+09 30 0.000000e+00 \n",
"3 2.649903e+09 -30 1.302907e+09 \n",
"4 6.140350e+08 30 2.649903e+09 \n",
"... ... ... ... \n",
"70725 1.875294e+07 30 9.625220e+06 \n",
"70726 1.933833e+07 30 1.875294e+07 \n",
"70727 3.113004e+07 30 1.933833e+07 \n",
"70728 6.549797e+07 -30 3.113004e+07 \n",
"70729 4.359533e+07 -30 6.549797e+07 \n",
"\n",
" score_net_income_2 net_income_shift_3 score_net_income_3 \\\n",
"0 -25 NaN -20 \n",
"1 -25 NaN -20 \n",
"2 25 NaN -20 \n",
"3 25 0.000000e+00 20 \n",
"4 -25 1.302907e+09 20 \n",
"... ... ... ... \n",
"70725 25 0.000000e+00 20 \n",
"70726 25 9.625220e+06 20 \n",
"70727 25 1.875294e+07 20 \n",
"70728 25 1.933833e+07 20 \n",
"70729 -25 3.113004e+07 20 \n",
"\n",
" net_income_shift_4 score_net_income_4 score_net_income_ascending \n",
"0 NaN -15 -60 \n",
"1 NaN -15 -30 \n",
"2 NaN -15 20 \n",
"3 NaN -15 0 \n",
"4 0.00 15 40 \n",
"... ... ... ... \n",
"70725 NaN -15 60 \n",
"70726 0.00 15 90 \n",
"70727 9625220.11 15 90 \n",
"70728 18752944.38 15 30 \n",
"70729 19338331.78 15 -20 \n",
"\n",
"[70730 rows x 46 columns]"
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>code</th>\n",
" <th>year</th>\n",
" <th>total_stockholder_interest</th>\n",
" <th>net_income</th>\n",
" <th>total_assets</th>\n",
" <th>total_revenue</th>\n",
" <th>inventories</th>\n",
" <th>accounts_receivable</th>\n",
" <th>operating_costs</th>\n",
" <th>operating_profit</th>\n",
" <th>...</th>\n",
" <th>score_operating_safety_margin</th>\n",
" <th>net_income_shift_1</th>\n",
" <th>score_net_income_1</th>\n",
" <th>net_income_shift_2</th>\n",
" <th>score_net_income_2</th>\n",
" <th>net_income_shift_3</th>\n",
" <th>score_net_income_3</th>\n",
" <th>net_income_shift_4</th>\n",
" <th>score_net_income_4</th>\n",
" <th>score_net_income_ascending</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0.000000e+00</td>\n",
" <td>0.000000e+00</td>\n",
" <td>0.000000e+00</td>\n",
" <td>0.000000e+00</td>\n",
" <td>0.000000e+00</td>\n",
" <td>0.000000e+00</td>\n",
" <td>0.000000e+00</td>\n",
" <td>0.000000e+00</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0.000000e+00</td>\n",
" <td>0</td>\n",
" <td>NaN</td>\n",
" <td>-25</td>\n",
" <td>NaN</td>\n",
" <td>-20</td>\n",
" <td>NaN</td>\n",
" <td>-15</td>\n",
" <td>-60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>000001.SZ</td>\n",
" <td>2006</td>\n",
" <td>6.474463e+09</td>\n",
" <td>1.302907e+09</td>\n",
" <td>2.605763e+11</td>\n",
" <td>7.135218e+09</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>1.905169e+09</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>3.110076e+08</td>\n",
" <td>30</td>\n",
" <td>NaN</td>\n",
" <td>-25</td>\n",
" <td>NaN</td>\n",
" <td>-20</td>\n",
" <td>NaN</td>\n",
" <td>-15</td>\n",
" <td>-30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>000001.SZ</td>\n",
" <td>2007</td>\n",
" <td>1.300606e+10</td>\n",
" <td>2.649903e+09</td>\n",
" <td>3.525394e+11</td>\n",
" <td>1.080750e+10</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>3.721942e+09</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1.302907e+09</td>\n",
" <td>30</td>\n",
" <td>0.000000e+00</td>\n",
" <td>25</td>\n",
" <td>NaN</td>\n",
" <td>-20</td>\n",
" <td>NaN</td>\n",
" <td>-15</td>\n",
" <td>20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>000001.SZ</td>\n",
" <td>2008</td>\n",
" <td>1.640079e+10</td>\n",
" <td>6.140350e+08</td>\n",
" <td>4.744402e+11</td>\n",
" <td>1.451312e+10</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>8.034260e+08</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>2.649903e+09</td>\n",
" <td>-30</td>\n",
" <td>1.302907e+09</td>\n",
" <td>25</td>\n",
" <td>0.000000e+00</td>\n",
" <td>20</td>\n",
" <td>NaN</td>\n",
" <td>-15</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>000001.SZ</td>\n",
" <td>2009</td>\n",
" <td>2.046961e+10</td>\n",
" <td>5.030729e+09</td>\n",
" <td>5.878110e+11</td>\n",
" <td>1.511444e+10</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>6.159127e+09</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>6.140350e+08</td>\n",
" <td>30</td>\n",
" <td>2.649903e+09</td>\n",
" <td>-25</td>\n",
" <td>1.302907e+09</td>\n",
" <td>20</td>\n",
" <td>0.00</td>\n",
" <td>15</td>\n",
" <td>40</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>70725</th>\n",
" <td>871981.BJ</td>\n",
" <td>2019</td>\n",
" <td>1.484536e+08</td>\n",
" <td>1.933833e+07</td>\n",
" <td>3.438787e+08</td>\n",
" <td>2.288707e+08</td>\n",
" <td>6.439933e+07</td>\n",
" <td>9.145441e+07</td>\n",
" <td>1.759984e+08</td>\n",
" <td>1.982603e+07</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1.875294e+07</td>\n",
" <td>30</td>\n",
" <td>9.625220e+06</td>\n",
" <td>25</td>\n",
" <td>0.000000e+00</td>\n",
" <td>20</td>\n",
" <td>NaN</td>\n",
" <td>-15</td>\n",
" <td>60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>70726</th>\n",
" <td>871981.BJ</td>\n",
" <td>2020</td>\n",
" <td>1.963500e+08</td>\n",
" <td>3.113004e+07</td>\n",
" <td>4.407381e+08</td>\n",
" <td>3.221584e+08</td>\n",
" <td>7.076865e+07</td>\n",
" <td>1.028702e+08</td>\n",
" <td>2.444144e+08</td>\n",
" <td>3.466142e+07</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1.933833e+07</td>\n",
" <td>30</td>\n",
" <td>1.875294e+07</td>\n",
" <td>25</td>\n",
" <td>9.625220e+06</td>\n",
" <td>20</td>\n",
" <td>0.00</td>\n",
" <td>15</td>\n",
" <td>90</td>\n",
" </tr>\n",
" <tr>\n",
" <th>70727</th>\n",
" <td>871981.BJ</td>\n",
" <td>2021</td>\n",
" <td>4.901179e+08</td>\n",
" <td>6.549797e+07</td>\n",
" <td>7.806398e+08</td>\n",
" <td>4.748939e+08</td>\n",
" <td>1.102946e+08</td>\n",
" <td>1.094223e+08</td>\n",
" <td>3.501988e+08</td>\n",
" <td>7.194507e+07</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>3.113004e+07</td>\n",
" <td>30</td>\n",
" <td>1.933833e+07</td>\n",
" <td>25</td>\n",
" <td>1.875294e+07</td>\n",
" <td>20</td>\n",
" <td>9625220.11</td>\n",
" <td>15</td>\n",
" <td>90</td>\n",
" </tr>\n",
" <tr>\n",
" <th>70728</th>\n",
" <td>871981.BJ</td>\n",
" <td>2022</td>\n",
" <td>5.238630e+08</td>\n",
" <td>4.359533e+07</td>\n",
" <td>8.141886e+08</td>\n",
" <td>3.872667e+08</td>\n",
" <td>1.034502e+08</td>\n",
" <td>7.252802e+07</td>\n",
" <td>3.141146e+08</td>\n",
" <td>2.827675e+07</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>6.549797e+07</td>\n",
" <td>-30</td>\n",
" <td>3.113004e+07</td>\n",
" <td>25</td>\n",
" <td>1.933833e+07</td>\n",
" <td>20</td>\n",
" <td>18752944.38</td>\n",
" <td>15</td>\n",
" <td>30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>70729</th>\n",
" <td>871981.BJ</td>\n",
" <td>2023</td>\n",
" <td>5.044336e+08</td>\n",
" <td>-5.665176e+06</td>\n",
" <td>7.671820e+08</td>\n",
" <td>3.613133e+08</td>\n",
" <td>1.070212e+08</td>\n",
" <td>7.962033e+07</td>\n",
" <td>3.220560e+08</td>\n",
" <td>-1.192680e+07</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>4.359533e+07</td>\n",
" <td>-30</td>\n",
" <td>6.549797e+07</td>\n",
" <td>-25</td>\n",
" <td>3.113004e+07</td>\n",
" <td>20</td>\n",
" <td>19338331.78</td>\n",
" <td>15</td>\n",
" <td>-20</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>70730 rows × 46 columns</p>\n",
"</div>"
]
},
"execution_count": 469,
"metadata": {},
"output_type": "execute_result"
}
],
"execution_count": 469
},
{
"metadata": {},
"cell_type": "code",
"source": [
"def score_by_net_income_ascending(code):\n",
" temp_df = finance_df[finance_df[\"code\"] == code].copy()[[\"year\", \"net_income\"]]\n",
" temp_df.sort_values(by=\"year\", ascending=False, inplace=True)\n",
" temp_df.set_index(keys=\"year\", drop=True, inplace=True)\n",
"\n",
" score = 0\n",
" if temp_df.iloc[0].values[0] > temp_df.iloc[1].values[0]:\n",
" score += 30\n",
" else:\n",
" score -= 30\n",
"\n",
" if temp_df.iloc[1].values[0] > temp_df.iloc[2].values[0]:\n",
" score += 25\n",
" else:\n",
" score -= 25\n",
"\n",
" if temp_df.iloc[2].values[0] > temp_df.iloc[3].values[0]:\n",
" score += 20\n",
" else:\n",
" score -= 20\n",
"\n",
" if temp_df.iloc[3].values[0] > temp_df.iloc[4].values[0]:\n",
" score += 15\n",
" else:\n",
" score -= 15\n",
"\n",
" return score\n",
"\n",
"\n",
"codes = list(map(lambda x: add_score(x, score_by_net_income_ascending(x[0])), codes))"
],
"id": "4ab6a7b485dc9349",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "code",
"source": [
"def score_by_operating_net_cash_flow_ascending(code):\n",
" temp_df = finance_df[finance_df[\"code\"] == code].copy()[[\"year\", \"operating_net_cash_flow\"]]\n",
" temp_df.sort_values(by=\"year\", ascending=False, inplace=True)\n",
" temp_df.set_index(keys=\"year\", drop=True, inplace=True)\n",
"\n",
" score = 0\n",
" if temp_df.iloc[0].values[0] > temp_df.iloc[1].values[0]:\n",
" score += 30\n",
" else:\n",
" score -= 30\n",
"\n",
" if temp_df.iloc[1].values[0] > temp_df.iloc[2].values[0]:\n",
" score += 25\n",
" else:\n",
" score -= 25\n",
"\n",
" if temp_df.iloc[2].values[0] > temp_df.iloc[3].values[0]:\n",
" score += 20\n",
" else:\n",
" score -= 20\n",
"\n",
" if temp_df.iloc[3].values[0] > temp_df.iloc[4].values[0]:\n",
" score += 15\n",
" else:\n",
" score -= 15\n",
"\n",
" return score\n",
"\n",
"\n",
"codes = list(map(lambda x: add_score(x, score_by_operating_net_cash_flow_ascending(x[0])), codes))"
],
"id": "d6644089e803a79d",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "code",
"source": [
"df = pd.DataFrame(\n",
" codes,\n",
" columns=[\"code\", \"name\", \"score\", \"roe_score\", \"roa_score\", \"net_income\", \"assets_turnover_and_cash\",\n",
" \"collection_cash_period_and_sales_period\", \"gross_profit_ratio_volatility\",\n",
" \"operating_safety_margin\", \"net_income_ascending\",\n",
" \"operating_net_cash_flow_ascending\"]\n",
")\n",
"df.sort_values(by=\"score\", ascending=False, inplace=True)\n",
"df"
],
"id": "ef9e6259efc1c1d0",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "code",
"source": "df[:100][\"code\"]",
"id": "32b5b4778985afaa",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "code",
"source": [
"temp_df = finance_df[finance_df[\"code\"] == \"600763.SH\"]\n",
"cal_roe(temp_df)\n",
"cal_roa(temp_df)\n",
"temp_df[[\"year\", \"roe\", \"roa\"]]"
],
"id": "e6d973f4ff98ebef",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "code",
"source": [
"temp_df = ts_pro.fina_indicator(\n",
" ts_code=\"600763.SH\", start_date=\"20140101\", end_date=\"20241231\",\n",
" fields=\"ts_code,end_date,roe,roa\"\n",
")\n",
"temp_df = temp_df[temp_df[\"end_date\"].str.endswith(\"1231\")]\n",
"# temp_df[\"end_date\"] = temp_df[\"end_date\"].str[:4]\n",
"# temp_df = temp_df.drop_duplicates(subset=[\"end_date\"], keep=\"last\")\n",
"temp_df"
],
"id": "b220252765677c76",
"outputs": [],
"execution_count": null
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}