Synergistic Carbon Trading and Power Generation Decision Considering the Annual Compliance Cycle and Market Response: A Hybrid Mathematical-Deep Reinforcement Learning Optimization Approach

Fund Project:

This work is jointly supported by the Natural Science Foundation of China-Smart Grid Joint Fund of State Grid Corporation of China (No. U2066212); the National Natural Science Foundation of China (No. 52207105); and the Key Science and Technology Projects of China Southern Power Grid Corporation (No. 066600KK52222023).

    Abstract:

    The annual compliance cycle of the carbon trading system allows generation companies (GenCos) to decouple the timing of carbon allowance purchases from their actual emissions. However, trading a large volume of allowances within a single day can significantly affect carbon prices. Faced with uncertain future carbon and electricity prices, GenCos must solve a challenging multistage stochastic optimization problem to coordinate their carbon trading strategies with daily power generation decisions. In this paper, a two-layer hybrid mathematical-deep reinforcement learning (DRL) optimization framework is proposed. The upper DRL layer tackles the stochastic, year-long carbon trading and allowance usage optimization problem, aiming for long-term optimality and providing guidance for the short-term decisions in the lower layer. The lower mathematical optimization layer addresses the deterministic daily power generation scheduling problem while enforcing strict technical constraints. To accelerate learning over the annual compliance cycle, a decision timeline transfer learning method is proposed, enabling the DRL agent to progressively refine its policy through sequential training on monthly, weekly, and daily decision environments. Case studies demonstrate that, with these methods, a GenCo can reduce emission costs and increase profits by effectively leveraging carbon price fluctuations within the compliance cycle.

Get Citation

Shouyuan Shi, Zhenning Pan, Member, IEEE, Junbin Chen, Tao Yu, Senior Member, IEEE. Synergistic Carbon Trading and Power Generation Decision Considering the Annual Compliance Cycle and Market Response: A Hybrid Mathematical-Deep Reinforcement Learning Optimization Approach[J]. Protection and Control of Modern Power Systems, 2026, V11(01): 173-191.

History
  • Online: January 05, 2026