Data Analysis
Retrieve historical commodity prices, aggregate them into time series, run basic trend analysis, and export the results.
Problem
You need to analyze price history — compute moving averages, compare periods, spot trends, and hand the data to analysts or BI tools. Pulling one price at a time and reassembling series by hand is slow and error-prone. You want an efficient way to fetch a clean time series and turn it into insight.
Solution Architecture
A simple extract → transform → export pipeline:
OilPriceAPI /prices/historical ──> DataFrame ──> Aggregate/Analyze ──> Export
(paginated history) (in memory) (resample, trends) (CSV/JSON)
- Historical retrieval pulls a date range in pages.
- Aggregation resamples raw ticks into daily/weekly buckets.
- Trend analysis computes moving averages and period-over-period change.
- Export writes the result to CSV/JSON for downstream tools.
Code Implementation
Historical data retrieval
Request a date range from the historical endpoint and page through the results:
import requests
API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.oilpriceapi.com/v1"
def fetch_history(commodity, start, end):
"""Fetch all historical prices for a commodity across a date range."""
rows, page = [], 1
while True:
resp = requests.get(
f"{BASE_URL}/prices/historical",
headers={"Authorization": f"Token {API_KEY}"},
params={
"commodity": commodity,
"from": start,
"to": end,
"page": page,
},
timeout=30,
)
resp.raise_for_status()
payload = resp.json()
batch = payload.get("data", [])
if not batch:
break
rows.extend(batch)
page += 1
return rows
Time series aggregation
Resample raw observations into regular intervals (e.g. daily averages) so series are comparable:
import pandas as pd
def to_daily_series(rows):
df = pd.DataFrame(rows)
df["created_at"] = pd.to_datetime(df["created_at"])
df = df.set_index("created_at").sort_index()
# Daily mean price; forward-fill gaps from non-trading days
return df["price"].resample("1D").mean().ffill()
Trend analysis
Layer moving averages and percentage change on top of the series:
def add_trends(series):
out = series.to_frame("price")
out["sma_7"] = series.rolling(7).mean() # 7-day moving average
out["sma_30"] = series.rolling(30).mean() # 30-day moving average
out["pct_change_1d"] = series.pct_change() * 100
return out
Export formats
Write results to whatever your downstream tools expect:
analysis = add_trends(to_daily_series(fetch_history("BRENT_CRUDE_USD",
"2024-01-01",
"2024-12-31")))
analysis.to_csv("brent_2024.csv") # CSV for spreadsheets/BI
analysis.to_json("brent_2024.json", orient="records", date_format="iso")
Best Practices
- Page through history instead of requesting one giant range; respect pagination so you do not miss rows.
- Cache raw history locally (Parquet/CSV) so repeated analyses do not re-pull the same range from the API.
- Resample before comparing so series from different commodities line up on the same time grid.
- Annotate gaps (weekends, holidays) explicitly rather than silently interpolating across them.
- Pin your date range and units in the export filename/metadata for reproducibility.
Common Pitfalls
- Ignoring pagination. Stopping after the first page silently truncates your series and skews every statistic.
- Mixing time zones. Parse timestamps as UTC and convert once; comparing naive and tz-aware timestamps raises errors or produces wrong buckets.
- Interpolating across non-trading days without realizing it, which inflates sample counts and distorts moving averages.
- Re-fetching on every run. Hitting the API for static historical data on each execution wastes quota — cache it.