API Reference
Main Configuration Classes
EstatDltConfig
Main configuration class for integrating the e-Stat API with dlt. It combines the source and destination configurations and provides additional processing options for data extraction and loading.
Bases: BaseModel
Attributes:
Name | Type | Description |
---|---|---|
`source` | `SourceConfig` | e-Stat API source configuration. |
`destination` | `DestinationConfig` | DLT destination configuration. |
`batch_size` | `Optional[int]` | Number of records per batch. |
`max_retries` | `int` | Maximum API retry attempts. |
`timeout` | `Optional[int]` | API request timeout in seconds. |
Source code in src/estat_api_dlt_helper/config/models.py
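This page does not show how `max_retries` is applied internally. As a rough illustration only, a retry loop bounded by such a setting could look like the sketch below. `call_with_retries` is a hypothetical helper written for this example and is not part of estat_api_dlt_helper.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T], max_retries: int) -> T:
    """Call func once, then retry up to max_retries more times on failure.

    Hypothetical helper: the library's own retry behavior may differ.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be zero or greater")
    last_error: Optional[Exception] = None
    # One initial attempt plus max_retries retries.
    for _attempt in range(max_retries + 1):
        try:
            return func()
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
    assert last_error is not None
    raise last_error
```

A production version would normally wait between attempts and retry only on transient errors such as timeouts.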
SourceConfig
Configuration class for the e-Stat API data source. It defines the parameters used to fetch statistical data, including authentication, selection of the statistical tables to fetch, and pagination and other options.
See the e-Stat API specification for details on each parameter.
Bases: BaseModel
Attributes:
Name | Type | Description |
---|---|---|
`app_id` | `str` | e-Stat API application ID for authentication. |
`statsDataId` | `Union[str, List[str]]` | Statistical table ID(s) to fetch. |
`lang` | `Literal['J', 'E']` | Language for API response (J: Japanese, E: English). |
`metaGetFlg` | `Literal['Y', 'N']` | Whether to fetch metadata. |
`cntGetFlg` | `Literal['Y', 'N']` | Whether to fetch only record count. |
Source code in src/estat_api_dlt_helper/config/models.py
validate_stats_data_id(v)
classmethod
Ensure statsDataId is valid.
Source code in src/estat_api_dlt_helper/config/models.py
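The exact rules behind `validate_stats_data_id` are not listed here. Since `statsDataId` accepts either one ID or a list of IDs, a validator of this kind plausibly normalizes both shapes and rejects empty values. The sketch below assumes that "valid" means "at least one non-empty string"; `normalize_stats_data_id` is a hypothetical stand-in, not the library's validator.

```python
from typing import List, Union


def normalize_stats_data_id(value: Union[str, List[str]]) -> List[str]:
    """Return statsDataId as a list of non-empty, stripped ID strings.

    Assumption: an ID is valid when it is a non-empty string.
    """
    ids = [value] if isinstance(value, str) else list(value)
    if not all(isinstance(item, str) for item in ids):
        raise TypeError("statsDataId entries must be strings")
    cleaned = [item.strip() for item in ids]
    if not cleaned or any(not item for item in cleaned):
        raise ValueError("statsDataId must contain at least one non-empty ID")
    return cleaned
```

Normalizing to a list lets downstream code loop over the IDs without checking which of the two shapes it received.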
DestinationConfig
Configuration class for the dlt destination (where the data is loaded). It defines the dlt settings, including the destination type (for example, a data warehouse), the dataset and table names, and the write strategy.
Bases: BaseModel
Attributes:
Name | Type | Description |
---|---|---|
`destination` | `Union[str, Any]` | DLT destination type or configuration object. |
`dataset_name` | `str` | Target dataset/schema name. |
`table_name` | `str` | Target table name. |
`write_disposition` | `Literal['append', 'replace', 'merge']` | How to write data (append/replace/merge). |
`primary_key` | `Optional[Union[str, List[str]]]` | Primary key columns for merge operations. |
Source code in src/estat_api_dlt_helper/config/models.py
validate_primary_key(v, info)
classmethod
Validate primary key is provided for merge operations.
Source code in src/estat_api_dlt_helper/config/models.py
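The rule that `validate_primary_key` enforces can be pictured as a small cross-field check: a `merge` write needs a key to match rows on, while `append` and `replace` do not. The sketch below is a plain-function approximation under that assumption; `check_primary_key` is a hypothetical name, and the real validator runs inside the pydantic model.

```python
from typing import List, Optional, Union

PrimaryKey = Optional[Union[str, List[str]]]


def check_primary_key(write_disposition: str, primary_key: PrimaryKey) -> PrimaryKey:
    """Reject a merge write that has no primary key; pass other cases through."""
    # An empty string or empty list counts as "not provided".
    if write_disposition == "merge" and not primary_key:
        raise ValueError("primary_key is required when write_disposition is 'merge'")
    return primary_key
```

Failing at configuration time surfaces the problem before any API request or load is attempted.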
API Client
EstatApiClient
Client class for accessing the e-Stat API. It provides methods to fetch statistical data from e-Stat, the Japanese government statistics API, and handles API authentication, request formatting, and response parsing.
Attributes:
Name | Type | Description |
---|---|---|
`app_id` | | e-Stat API application ID for authentication. |
`base_url` | | Base URL for API endpoints. |
`timeout` | | Request timeout in seconds. |
`session` | | HTTP session for connection pooling. |
Source code in src/estat_api_dlt_helper/api/client.py
__init__(app_id, base_url=None, timeout=60)
Initialize e-Stat API client.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`app_id` | `str` | e-Stat API application ID | required |
`base_url` | `Optional[str]` | Base URL for API (defaults to official endpoint) | `None` |
`timeout` | `int` | Request timeout in seconds | `60` |
Source code in src/estat_api_dlt_helper/api/client.py
close()
Close the session.
Source code in src/estat_api_dlt_helper/api/client.py
get_stats_data(stats_data_id, start_position=1, limit=100000, meta_get_flg='Y', cnt_get_flg='N', explanation_get_flg='Y', annotation_get_flg='Y', replace_sp_chars='0', lang='J', **additional_params)
Get statistical data from e-Stat API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`stats_data_id` | `str` | Statistical data ID | required |
`start_position` | `int` | Start position for data retrieval (1-based) | `1` |
`limit` | `int` | Maximum number of records to retrieve | `100000` |
`meta_get_flg` | `str` | Whether to get metadata (Y/N) | `'Y'` |
`cnt_get_flg` | `str` | Whether to get count only (Y/N) | `'N'` |
`explanation_get_flg` | `str` | Whether to get explanations (Y/N) | `'Y'` |
`annotation_get_flg` | `str` | Whether to get annotations (Y/N) | `'Y'` |
`replace_sp_chars` | `str` | Replace special characters (0: No, 1: Yes, 2: Remove) | `'0'` |
`lang` | `str` | Language (J: Japanese, E: English) | `'J'` |
`**additional_params` | `Any` | Additional query parameters | `{}` |
Returns:
Type | Description |
---|---|
`Dict[str, Any]` | API response as dictionary |
Source code in src/estat_api_dlt_helper/api/client.py
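To make the relationship between these snake_case arguments and the request concrete, the sketch below builds a query-parameter dictionary using the camelCase names of the e-Stat API (`statsDataId`, `startPosition`, and so on). Whether the client maps its arguments exactly this way is an assumption, and `build_stats_data_params` is a hypothetical helper, not a function of the library.

```python
from typing import Any, Dict


def build_stats_data_params(
    app_id: str,
    stats_data_id: str,
    start_position: int = 1,
    limit: int = 100000,
    meta_get_flg: str = "Y",
    cnt_get_flg: str = "N",
    explanation_get_flg: str = "Y",
    annotation_get_flg: str = "Y",
    replace_sp_chars: str = "0",
    lang: str = "J",
    **additional_params: Any,
) -> Dict[str, Any]:
    """Map the documented arguments onto e-Stat query parameter names."""
    params: Dict[str, Any] = {
        "appId": app_id,
        "statsDataId": stats_data_id,
        "startPosition": start_position,
        "limit": limit,
        "metaGetFlg": meta_get_flg,
        "cntGetFlg": cnt_get_flg,
        "explanationGetFlg": explanation_get_flg,
        "annotationGetFlg": annotation_get_flg,
        "replaceSpChars": replace_sp_chars,
        "lang": lang,
    }
    # Extra keyword arguments are passed through as-is (assumed behavior).
    params.update(additional_params)
    return params
```

For instance, `build_stats_data_params("YOUR_API_KEY", "0000020211", limit=10, cdArea="13000")` keeps the defaults for the flags and adds `cdArea` as an extra filter.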
get_stats_data_generator(stats_data_id, limit_per_request=100000, **kwargs)
Get statistical data as a generator for pagination.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`stats_data_id` | `str` | Statistical data ID | required |
`limit_per_request` | `int` | Number of records per request | `100000` |
`**kwargs` | `Any` | Additional parameters for get_stats_data | `{}` |
Yields:
Type | Description |
---|---|
`Dict[str, Any]` | Response data for each page |
Source code in src/estat_api_dlt_helper/api/client.py
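The pagination pattern behind a generator like this can be sketched without any network access. The version below assumes that each response carries the position of the next page in `GET_STATS_DATA.STATISTICAL_DATA.RESULT_INF.NEXT_KEY` and that the key is absent on the last page. `iter_pages` and its `fetch_page` callback are hypothetical names used only for this illustration.

```python
from typing import Any, Callable, Dict, Iterator


def iter_pages(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    limit_per_request: int = 100000,
) -> Iterator[Dict[str, Any]]:
    """Yield pages from fetch_page(start_position, limit) until NEXT_KEY is absent."""
    start_position = 1  # e-Stat positions are 1-based
    while True:
        page = fetch_page(start_position, limit_per_request)
        yield page
        result_inf = (
            page.get("GET_STATS_DATA", {})
            .get("STATISTICAL_DATA", {})
            .get("RESULT_INF", {})
        )
        next_key = result_inf.get("NEXT_KEY")
        if next_key is None:
            break  # last page reached
        start_position = int(next_key)
```

Because pages are yielded one at a time, a caller can process or load each page before the next request is made, which keeps memory use bounded for large tables.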
get_stats_list(search_word=None, survey_years=None, stats_code=None, **kwargs)
Get list of available statistics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`search_word` | `Optional[str]` | Search keyword | `None` |
`survey_years` | `Optional[str]` | Survey years (YYYY or YYYYMM-YYYYMM) | `None` |
`stats_code` | `Optional[str]` | Statistics code | `None` |
`**kwargs` | `Any` | Additional query parameters | `{}` |
Returns:
Type | Description |
---|---|
`Dict[str, Any]` | API response as dictionary |
Source code in src/estat_api_dlt_helper/api/client.py
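The `survey_years` formats listed above (`YYYY` or `YYYYMM-YYYYMM`) are easy to get wrong, so a caller may want to check the string before sending a request. The sketch below covers only those two documented forms. `is_valid_survey_years` is a hypothetical helper, and nothing on this page indicates that the client performs this check itself.

```python
import re

# Either a four-digit year or a YYYYMM-YYYYMM range, as listed in the table above.
_SURVEY_YEARS = re.compile(r"\d{4}|\d{6}-\d{6}")


def is_valid_survey_years(value: str) -> bool:
    """Return True when value matches one of the two documented formats."""
    return _SURVEY_YEARS.fullmatch(value) is not None
```

With this check, `"2020"` and `"202001-202012"` pass, while `"2020-2021"` is rejected because a range must use six-digit year-month values.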
Data Parsing
parse_response
Parses an e-Stat API response and converts it to Arrow format. This is the main entry point for parsing e-Stat API responses: it takes the JSON response and returns a structured Arrow table containing the data values and their associated metadata.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`data` | `Dict[str, Any]` | The complete JSON response from e-Stat API | required |
Returns:
Type | Description |
---|---|
`Table` | pa.Table: Arrow table containing the parsed data with metadata |
Raises:
Type | Description |
---|---|
`ValueError` | If required data sections are missing |
`KeyError` | If expected keys are not found in the response |
Source code in src/estat_api_dlt_helper/parser/response_parser.py
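As a simplified picture of the parsing step, the sketch below turns the `VALUE` records of a response into column-oriented data. It is not the library's parser: it uses a plain dict of lists as a stand-in for an Arrow table, it ignores the class metadata, and it assumes the records live under `GET_STATS_DATA.STATISTICAL_DATA.DATA_INF.VALUE` with `@`-prefixed attribute keys and `$` holding the observed value. `values_to_columns` is a hypothetical name.

```python
from typing import Any, Dict, List


def values_to_columns(data: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Turn the VALUE records of a response into equal-length columns."""
    try:
        stat_data = data["GET_STATS_DATA"]["STATISTICAL_DATA"]
    except KeyError as exc:
        raise ValueError("response has no STATISTICAL_DATA section") from exc
    values = stat_data.get("DATA_INF", {}).get("VALUE")
    if values is None:
        raise ValueError("response has no DATA_INF.VALUE section")
    if isinstance(values, dict):
        # Defensive: treat a lone record object as a one-row list.
        values = [values]

    # Collect every key seen across records, preserving first-seen order.
    keys: List[str] = []
    for record in values:
        for key in record:
            if key not in keys:
                keys.append(key)

    def column_name(key: str) -> str:
        # "$" holds the observed value; other keys lose their "@" prefix.
        return "value" if key == "$" else key.lstrip("@")

    # Records that lack a key contribute None, so all columns stay aligned.
    return {column_name(k): [r.get(k) for r in values] for k in keys}
```

A column-oriented result like this maps naturally onto an Arrow table, which stores each field as a contiguous array.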
Data Loader Functions
load_estat_data
Convenience function that loads e-Stat API data into the specified destination. It creates and runs a dlt pipeline with the provided configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`config` | `EstatDltConfig` | Configuration for e-Stat API source and DLT destination | required |
`credentials` | `Optional[Dict[str, Any]]` | Optional credentials to override destination credentials | `None` |
`**kwargs` | `Any` | Additional arguments passed to pipeline.run() | `{}` |
Returns:
Type | Description |
---|---|
`Any` | LoadInfo object containing information about the load operation |
Example

    from estat_api_dlt_helper import EstatDltConfig, load_estat_data

    # Source and destination settings as a plain dict
    config = {
        "source": {
            "app_id": "YOUR_API_KEY",
            "statsDataId": "0000020211",
            "limit": 10
        },
        "destination": {
            "destination": "duckdb",
            "dataset_name": "demo",
            "table_name": "demo",
            "write_disposition": "merge",
            "primary_key": ["time", "area", "cat01"]
        }
    }

    # Validate the settings, then create and run the pipeline
    config = EstatDltConfig(**config)
    info = load_estat_data(config)
    print(info)

Source code in src/estat_api_dlt_helper/loader/load_manager.py
create_estat_resource
Creates a dlt resource for e-Stat API data. The resource is customizable and fetches data from the e-Stat API based on the provided configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`config` | `EstatDltConfig` | Configuration for e-Stat API source and destination | required |
`name` | `Optional[str]` | Resource name (defaults to table_name from config) | `None` |
`primary_key` | `Optional[Any]` | Primary key columns (overrides config if provided) | `None` |
`write_disposition` | `Optional[str]` | Write disposition (overrides config if provided) | `None` |
`columns` | `Optional[Any]` | Column definitions for the resource | `None` |
`table_format` | `Optional[str]` | Table format for certain destinations | `None` |
`file_format` | `Optional[str]` | File format for filesystem destinations | `None` |
`schema_contract` | `Optional[Any]` | Schema contract settings | `None` |
`table_name` | `Optional[Callable[[Any], str]]` | Callable to generate dynamic table names | `None` |
`max_table_nesting` | `Optional[int]` | Maximum nesting level for nested data | `None` |
`selected` | `Optional[bool]` | Whether this resource is selected for loading | `None` |
`merge_key` | `Optional[Any]` | Merge key for merge operations | `None` |
`parallelized` | `Optional[bool]` | Whether to parallelize this resource | `None` |
`**resource_kwargs` | `Any` | Additional keyword arguments for dlt.resource | `{}` |
Returns:
Type | Description |
---|---|
`Any` | dlt.Resource: Configured DLT resource for e-Stat API data |
Example

    from estat_api_dlt_helper import EstatDltConfig, create_estat_resource

    config = EstatDltConfig(...)

    # Resource with the settings taken from config
    resource = create_estat_resource(config)

    # Customize the resource
    resource = create_estat_resource(
        config,
        name="custom_stats",
        columns={"time": {"data_type": "timestamp"}},
        selected=True
    )

Source code in src/estat_api_dlt_helper/loader/dlt_resource.py
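Several parameters of `create_estat_resource` are described as overriding the config when provided. One way to picture that precedence is the sketch below, which prefers an explicit argument and otherwise falls back to the destination settings. It works on a plain dict in place of `DestinationConfig`, and `resolve_resource_options` is a hypothetical helper rather than the library's code.

```python
from typing import Any, Dict, Optional


def resolve_resource_options(
    destination: Dict[str, Any],
    name: Optional[str] = None,
    primary_key: Optional[Any] = None,
    write_disposition: Optional[str] = None,
) -> Dict[str, Any]:
    """Prefer explicit arguments; fall back to the destination settings."""
    return {
        # The resource name defaults to the configured table name.
        "name": name if name is not None else destination["table_name"],
        "primary_key": (
            primary_key if primary_key is not None
            else destination.get("primary_key")
        ),
        "write_disposition": (
            write_disposition if write_disposition is not None
            else destination.get("write_disposition")
        ),
    }
```

Comparing against `None` rather than relying on truthiness means that only an omitted argument triggers the fallback.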
create_estat_pipeline
Creates a dlt pipeline for loading e-Stat API data. The pipeline is customizable and is configured for the specified destination based on the provided configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`config` | `EstatDltConfig` | Configuration for e-Stat API source and destination | required |
`pipeline_name` | `Optional[str]` | Name of the pipeline (overrides config if provided) | `None` |
`pipelines_dir` | `Optional[str]` | Directory to store pipeline state | `None` |
`dataset_name` | `Optional[str]` | Dataset name in destination (overrides config if provided) | `None` |
`import_schema_path` | `Optional[str]` | Path to import schema from | `None` |
`export_schema_path` | `Optional[str]` | Path to export schema to | `None` |
`dev_mode` | `Optional[bool]` | Development mode (overrides config if provided) | `None` |
`refresh` | `Optional[str]` | Schema refresh mode | `None` |
`progress` | `Optional[str]` | Progress reporting configuration | `None` |
`destination` | `Optional[Any]` | DLT destination (constructed from config if not provided) | `None` |
`staging` | `Optional[Any]` | Staging destination for certain loaders | `None` |
`**pipeline_kwargs` | `Any` | Additional keyword arguments for dlt.pipeline | `{}` |
Returns:
Type | Description |
---|---|
`Any` | dlt.Pipeline: Configured DLT pipeline |
Example

    from estat_api_dlt_helper import EstatDltConfig, create_estat_pipeline

    config = EstatDltConfig(...)

    # Pipeline with the settings taken from config
    pipeline = create_estat_pipeline(config)

    # Customize the pipeline
    pipeline = create_estat_pipeline(
        config,
        pipeline_name="custom_estat_pipeline",
        dev_mode=True,
        progress="log"
    )

Source code in src/estat_api_dlt_helper/loader/dlt_pipeline.py