Background/Question/Methods Soil organic carbon (SOC) is a determinant of multiple ecosystem services that soils provide to humanity. However, land-use and climate change may alter the current soil carbon balance and convert the land surface into a source or sink of atmospheric CO2, altering soil properties and functions. Recently, the use of machine learning (ML) approaches has increased in investigations of SOC storage and dynamics. We used ML approaches with a large number of soil field observations and environmental factor data to (1) predict the spatial heterogeneity of northern circumpolar surface SOC stocks, (2) develop scaling functions of SOC, and (3) develop functional relationships between environmental factors and SOC stocks to benchmark model representations.

Results/Conclusions Our results suggest that an ensemble ML approach improves the prediction accuracy of surface SOC stocks in comparison to other approaches. Both the mean and the variance of SOC stocks decreased linearly with spatial scale in the continental US. The functional relationships that we derived between environmental controls and SOC stocks produced prediction accuracy similar to that obtained from the random forest ML approach. Observations and models showed divergent climatic controls on SOC stocks. In summary, ML approaches can help in (1) quantifying anthropogenic and climatic impacts on SOC, and (2) reducing the uncertainty in model projections of future carbon-climate feedbacks.
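The idea of a scaling function for SOC variance can be illustrated with a minimal sketch. This is not the procedure used in this study. It assumes a hypothetical square grid of SOC stocks, aggregates it by averaging non-overlapping blocks, and fits a straight line to variance against block size by ordinary least squares. The grid values are synthetic random numbers, the function names (`block_average`, `fit_line`, `scaling_function`) are illustrative only, and the synthetic noise is not expected to reproduce the linear behavior reported above.

```python
import random
import statistics


def block_average(grid, block):
    """Coarsen a square grid by averaging non-overlapping block x block windows."""
    n = len(grid)
    usable = n - n % block  # drop trailing cells that do not fill a whole block
    coarse = []
    for i in range(0, usable, block):
        row = []
        for j in range(0, usable, block):
            cells = [grid[a][b]
                     for a in range(i, i + block)
                     for b in range(j, j + block)]
            row.append(statistics.fmean(cells))
        coarse.append(row)
    return coarse


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def scaling_function(grid, blocks):
    """Fit variance of the aggregated grid against block size (the scale)."""
    variances = []
    for b in blocks:
        flat = [v for row in block_average(grid, b) for v in row]
        variances.append(statistics.pvariance(flat))
    slope, intercept = fit_line(blocks, variances)
    return slope, intercept, variances


# Synthetic stand-in for a gridded SOC map (assumption: values are NOT real data).
random.seed(0)
synthetic = [[random.gauss(10.0, 3.0) for _ in range(32)] for _ in range(32)]
blocks = [1, 2, 4, 8]  # aggregation scales in grid cells (hypothetical)
slope, intercept, variances = scaling_function(synthetic, blocks)
```

Passing a list of per-scale means instead of variances to `fit_line` gives a scaling function for the mean in the same way. A negative `slope` indicates that the statistic decreases as the aggregation scale grows.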