Data Analysis

Retrieve historical commodity prices, aggregate them into time series, run basic trend analysis, and export the results.

Problem

You need to analyze price history — compute moving averages, compare periods, spot trends, and hand the data to analysts or BI tools. Pulling one price at a time and reassembling series by hand is slow and error-prone. You want an efficient way to fetch a clean time series and turn it into insight.

Solution Architecture

A simple extract → transform → export pipeline:

OilPriceAPI /prices/historical ──> DataFrame ──> Aggregate/Analyze ──> Export
   (paginated history)              (in memory)   (resample, trends)   (CSV/JSON)

Historical retrieval pulls a date range in pages.
Aggregation resamples raw ticks into daily/weekly buckets.
Trend analysis computes moving averages and period-over-period change.
Export writes the result to CSV/JSON for downstream tools.

Code Implementation

Historical data retrieval

Request a date range from the historical endpoint and page through the results:

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.oilpriceapi.com/v1"


def fetch_history(commodity, start, end):
    """Fetch all historical prices for a commodity across a date range."""
    rows, page = [], 1
    while True:
        resp = requests.get(
            f"{BASE_URL}/prices/historical",
            headers={"Authorization": f"Token {API_KEY}"},
            params={
                "commodity": commodity,
                "from": start,
                "to": end,
                "page": page,
            },
            timeout=30,
        )
        resp.raise_for_status()
        payload = resp.json()
        batch = payload.get("data", [])
        if not batch:
            break
        rows.extend(batch)
        page += 1
    return rows

Time series aggregation

Resample raw observations into regular intervals (e.g. daily averages) so series are comparable:

import pandas as pd

def to_daily_series(rows):
    df = pd.DataFrame(rows)
    df["created_at"] = pd.to_datetime(df["created_at"])
    df = df.set_index("created_at").sort_index()
    # Daily mean price; forward-fill gaps from non-trading days
    return df["price"].resample("1D").mean().ffill()

Trend analysis

Layer moving averages and percentage change on top of the series:

def add_trends(series):
    out = series.to_frame("price")
    out["sma_7"] = series.rolling(7).mean()    # 7-day moving average
    out["sma_30"] = series.rolling(30).mean()  # 30-day moving average
    out["pct_change_1d"] = series.pct_change() * 100
    return out

Export formats

Write results to whatever your downstream tools expect:

analysis = add_trends(to_daily_series(fetch_history("BRENT_CRUDE_USD",
                                                    "2024-01-01",
                                                    "2024-12-31")))

analysis.to_csv("brent_2024.csv")            # CSV for spreadsheets/BI
analysis.to_json("brent_2024.json", orient="records", date_format="iso")

Best Practices

Page through history instead of requesting one giant range; respect pagination so you do not miss rows.
Cache raw history locally (Parquet/CSV) so repeated analyses do not re-pull the same range from the API.
Resample before comparing so series from different commodities line up on the same time grid.
Annotate gaps (weekends, holidays) explicitly rather than silently interpolating across them.
Pin your date range and units in the export filename/metadata for reproducibility.

Common Pitfalls

Ignoring pagination. Stopping after the first page silently truncates your series and skews every statistic.
Mixing time zones. Parse timestamps as UTC and convert once; comparing naive and tz-aware timestamps raises errors or produces wrong buckets.
Interpolating across non-trading days without realizing it, which inflates sample counts and distorts moving averages.
Re-fetching on every run. Hitting the API for static historical data on each execution wastes quota — cache it.