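Why the ValueError?
First of all, scipy.interpolate.interpn requires that interp_points.shape[-1] equal the number of dimensions in your problem. That is why your snippet raises a ValueError: your interp_points has 92476 as n_dims, which conflicts with the actual number of dimensions (4).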
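To make the shape rule concrete, here is a minimal sketch on a toy 2-D grid. The arrays x, y, values, good and bad are made up for illustration and are not the OP's data. The last axis of the query array must have length 2 here, because the grid has two dimensions.

```python
import numpy as np
from scipy.interpolate import interpn

# Toy 2-D grid (illustrative sizes only)
x = np.arange(3.0)             # 3
y = np.arange(4.0)             # 4
values = np.add.outer(x, y)    # 3, 4 -> values[i, j] = x[i] + y[j]

good = np.array([[0.5, 1.5], [1.0, 2.0]])  # shape (2, 2): last axis == 2 dims
bad = np.array([[0.5, 1.5, 2.5]])          # shape (1, 3): last axis != 2 dims

ok = interpn((x, y), values, good)         # one value per query point

try:
    interpn((x, y), values, bad)
    raised = False
except ValueError:
    # Last axis of the query does not match the grid dimensionality
    raised = True

print(ok, raised)
```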
Quick fix
You can fix this snippet simply by changing the order of operations. You are squeezing too early; squeeze after the interpolation instead:
points = (solzen, aod, index, wave) # 7, 5, 92476, 140
mg = np.meshgrid(solzen0, aod0, index0, wave0) # 4, 1, 1, 92476, 70
interp_points = np.moveaxis(mg, 0, -1) # 1, 1, 92476, 70, 4
result_presqueeze = interpn(points,
skyrad0, interp_points) # 1, 1, 92476, 70
result = np.squeeze(result_presqueeze,
axis=(0,1)) # 92476, 70
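I've replaced interp_mesh with mg here and dropped the np.array call. It isn't needed: np.meshgrid returns a sequence of ndarrays, and np.moveaxis converts that into a single ndarray on its own.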
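The same order of operations can be checked on a much smaller grid. The sketch below uses made-up axes a, b, c, d and a test function that is linear along every axis, so linear interpolation should reproduce it. Passing indexing="ij" to np.meshgrid is my addition, not part of the snippet above: it keeps the output axes in the same order as the inputs, which only matters once the first two query axes are longer than 1.

```python
import numpy as np
from scipy.interpolate import interpn

# Toy grid axes (illustrative sizes, far smaller than the OP's arrays)
a = np.array([0.0, 10.0, 20.0])           # 3
b = np.array([0.0, 0.1, 0.2])             # 3
c = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # 5
d = np.array([350.0, 360.0, 370.0])       # 3

# Linear along every axis, so linear interpolation is exact
values = (a[:, None, None, None] + 100 * b[None, :, None, None]
          + c[None, None, :, None] + d[None, None, None, :])  # 3, 3, 5, 3

a0, b0 = 15.0, 0.1                # scalars
c0 = c                            # 5
d0 = np.array([355.0, 365.0])     # 2

mg = np.meshgrid(a0, b0, c0, d0, indexing="ij")  # 4 arrays, each 1, 1, 5, 2
interp_points = np.moveaxis(mg, 0, -1)           # 1, 1, 5, 2, 4
result_presqueeze = interpn((a, b, c, d), values, interp_points)  # 1, 1, 5, 2
result = np.squeeze(result_presqueeze, axis=(0, 1))               # 5, 2

expected = a0 + 100 * b0 + c0[:, None] + d0[None, :]
print(result.shape, np.allclose(result, expected))
```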
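Remarks on performance
I think your snippet is fine, but if you are working with labelled data you may want to use xarray, since it:
- is more readable than unlabelled numpy arrays
- can also hand some of the background work to dask https://dask.org/ (useful if you are examining a lot of 6D data)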
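Update: Whoops! This was supposed to be .interp, not .loc. The snippet below only worked because the requested points happened to be original data points. I'm leaving it here as a warning to others: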
from scipy.interpolate import interpn
import numpy as np
from xarray import DataArray
# Define the data space in the 4D skyrad0 array
solzen = np.arange(0,70,10) # 7
aod = np.arange(0,0.25,0.05) # 5
index = np.arange(1,92477,1) # 92476
wave = np.arange(350,1050,5) # 140
# Simulated skyrad for the values above
skyrad0 = np.random.rand(
solzen.size,aod.size,index.size,wave.size) # 7, 5, 92476, 140
# Data space for desired output values of skyrad
# with interpolation between input data space
solzen0 = 30 # 1
aod0 = 0.1 # 1
index0 = index # 92476
wave0 = np.arange(350,1050,10) # 70
def slow():
    points = (solzen, aod, index, wave) # 7, 5, 92476, 140
    mg = np.meshgrid(solzen0, aod0, index0, wave0) # 4, 1, 1, 92476, 70
    interp_points = np.moveaxis(mg, 0, -1) # 1, 1, 92476, 70, 4
    result_presqueeze = interpn(points,
                                skyrad0, interp_points) # 1, 1, 92476, 70
    result = np.squeeze(result_presqueeze,
                        axis=(0,1)) # 92476, 70
    return result
# This function uses .loc instead of .interp!
"""
def fast():
    da = DataArray(name='skyrad0',
                   data=skyrad0,
                   dims=['solzen','aod','index','wave'],
                   coords=[solzen, aod, index, wave])
    result = da.loc[solzen0, aod0, index0, wave0].squeeze()
    return result
"""
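To show why the .loc version only appeared to work, here is a small sketch on a made-up 1-D DataArray (the name da and its values are illustrative). Label lookup with .loc agrees with .interp on a grid point, but off the grid .loc has no matching label, while .interp interpolates. As far as I know, .interp relies on scipy being installed.

```python
import xarray as xr

# Toy 1-D data: y = 10 * x on the grid x = 0, 1, 2
da = xr.DataArray([0.0, 10.0, 20.0], dims=["x"],
                  coords={"x": [0.0, 1.0, 2.0]})

on_grid_loc = float(da.loc[1.0])          # exact label lookup
on_grid_interp = float(da.interp(x=1.0))  # same value on a grid point
off_grid_interp = float(da.interp(x=0.5)) # linear interpolation between 0 and 1

try:
    da.loc[0.5]            # 0.5 is not a coordinate label
    loc_failed = False
except KeyError:
    loc_failed = True

print(on_grid_loc, on_grid_interp, off_grid_interp, loc_failed)
```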
With some modifications to the updated snippet given by the OP:
import numpy as np
import xarray as xr
from scipy.interpolate import interpn
azimuth = np.arange(0, 185, 5) # 37
senzen = np.arange(0, 185, 5) # 37
#wave = np.arange(350,1050,5) # 140
wave = np.asarray([350, 360, 370, 380, 390, 410, 440, 470, 510,
550, 610, 670, 750, 865, 1040, 1240, 1640, 2250]) # 18
solzen = np.arange(0,65,5) # 13
aod = np.arange(0,0.55,0.05) # 11
wind = np.arange(0, 20, 5) # 4
coords = [azimuth, senzen, wave, solzen, aod, wind]
azimuth0 = 135 # 1
senzen0 = 140 # 1
wave0 = np.arange(350,1010,10) # 66
solzen0 = 30 # 1
aod0 = 0.1 # 1
wind0 = 10 # 1
interp_coords = [azimuth0, senzen0, wave0, solzen0, aod0, wind0]
# Simulated rad_boa
rad_boa = np.random.rand(
*map(lambda x: x.size, coords)) # 37, 37, 140/18, 13, 11, 4
def slow():
    mg = np.meshgrid(*interp_coords)
    interp_points = np.moveaxis(mg, 0, -1)
    result_presqueeze = interpn(coords,
                                rad_boa, interp_points)
    result = np.squeeze(result_presqueeze)
    return result
def fast():
    da = xr.DataArray(name='Radiance_BOA',
                      data=rad_boa,
                      dims=['azimuth','senzen','wave','solzen','aod','wind'],
                      coords=coords)
    interp_dict = dict(zip(da.dims, interp_coords))
    rad_inc_scaXR = da.interp(**interp_dict).squeeze()
    return rad_inc_scaXR
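Both are quite fast. The function names are carried over from the earlier snippet; in the timings below, the interpn version (slow) is in fact quicker than the xarray version (fast):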
>>> %timeit slow()
2.09 ms ± 85.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit fast()
343 ms ± 6.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> np.array_equal(slow(),fast())
True
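You can find more about xarray interpolation here http://xarray.pydata.org/en/stable/interpolation.html. Dataset instances have a very similar syntax.
You can also change the interpolation method if needed (for instance, you might want to pass the keyword argument method='nearest' to .interp for discrete interpolation problems).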
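As a rough illustration of the method keyword, the sketch below compares the default linear interpolation with method='nearest' on a made-up 1-D DataArray (names and values are illustrative, not taken from the OP's data).

```python
import xarray as xr

# Toy 1-D data: y = 10 * x on the grid x = 0, 1, 2
da = xr.DataArray([0.0, 10.0, 20.0], dims=["x"],
                  coords={"x": [0.0, 1.0, 2.0]})

linear = float(da.interp(x=0.4))                     # weighted between x=0 and x=1
nearest = float(da.interp(x=0.4, method="nearest"))  # snaps to the closest grid point, x=0

print(linear, nearest)
```

With 'nearest', the result is always one of the original data values, which is usually what you want when the quantity is categorical or otherwise should not be blended.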
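More advanced stuff
If you want to implement something more advanced, I'd suggest perhaps using one of the implementations of MARS (multivariate adaptive regression splines). It sits somewhere between standard regression and interpolation, and it works in the multidimensional case. In Python 3, your best bet is pyearth https://github.com/scikit-learn-contrib/py-earth.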