qfeval_functions.functions.groupby
- groupby(x, group_id, dim=-1, empty_value=nan)[source]
Group tensor elements by group identifiers along a dimension.
This function reorganizes elements of a tensor into groups based on provided group identifiers. It creates a new dimension that contains all elements belonging to each group. This is similar to SQL’s GROUP BY or pandas’ groupby operation, but adapted for tensor operations.
The function adds a new dimension immediately after the specified dimension. This new dimension represents the items within each group. Since tensors require fixed-size dimensions, groups with fewer elements are padded with
empty_value.- Parameters:
x (
Tensor) – The input tensor to be grouped.group_id (
Tensor) – A 1D tensor of integer group identifiers with the same length asx.shape[dim]. Each element specifies which group the corresponding element inxbelongs to. Group IDs should be non-negative integers.dim (
int) – The dimension along which to group elements. Default is -1 (the last dimension).empty_value (
Any) – The value used to pad groups that have fewer elements than the maximum group size. Default isnan.
- Returns:
A tensor with one additional dimension compared to the input. The shape is the same as
xexcept at dimensiondim, which is replaced by two dimensions:(num_groups, max_group_size). Elements are rearranged according to their group membership.- Return type:
Example
>>> # Group a 1D tensor >>> x = torch.tensor([10., 20., 30., 40., 50.]) >>> group_id = torch.tensor([0, 1, 0, 1, 0]) >>> grouped = QF.groupby(x, group_id) >>> grouped tensor([[10., 30., 50.], [20., 40., nan]])
>>> # Group along a specific dimension of 2D tensor >>> x = torch.tensor([[1., 2., 3., 4.], ... [5., 6., 7., 8.]]) >>> group_id = torch.tensor([0, 1, 0, 2]) >>> grouped = QF.groupby(x, group_id, dim=1) >>> grouped tensor([[[1., 3.], [2., nan], [4., nan]], [[5., 7.], [6., nan], [8., nan]]])
>>> # Using custom empty value >>> x = torch.tensor([1, 2, 3, 4, 5], dtype=torch.int) >>> group_id = torch.tensor([0, 0, 1, 1, 1]) >>> grouped = QF.groupby(x, group_id, empty_value=-1) >>> grouped tensor([[ 1, 2, -1], [ 3, 4, 5]], dtype=torch.int32)