anndata dataframes and matrices #14

Open
GreenGilad opened this issue Feb 17, 2022 · 1 comment

GreenGilad commented Feb 17, 2022

Hi there,

When I store a matrix with row and column names in the uns slot, the names are dropped. I assume this is to align with the Python NumPy implementation, where arrays do not carry row or column names.

To work around losing the row and column names, I can convert the R matrix to an R data frame. However, the anndata object then stores it as a pandas DataFrame. Why does the R anndata object store an R data frame as a pandas DataFrame rather than in the R format? Couldn't the conversion stay transparent to the user, happening only when reading and writing the h5ad file, so that once loaded the object has the class of an R data frame? Currently, every time I want to use such a data frame I must call reticulate::py_to_r, and I still lose the row and column names when doing so.

Couldn't it be the same as anndata$X?
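
One possible stopgap for the dropped names, sketched in base R only: store the bare values next to the row and column names and reattach them after reading. The helpers `pack_matrix` and `unpack_matrix` are hypothetical names, not part of the anndata package, and the sketch assumes that a list of a numeric matrix plus character vectors survives a round trip through uns.

```r
# Hypothetical helpers (not part of anndata): keep dimnames next to the values.
pack_matrix <- function(m) {
  list(
    values   = unname(m),    # plain matrix without dimnames
    rownames = rownames(m),  # character vector or NULL
    colnames = colnames(m)   # character vector or NULL
  )
}

unpack_matrix <- function(p) {
  m <- as.matrix(p$values)
  dimnames(m) <- list(p$rownames, p$colnames)  # reattach the stored names
  m
}

# Round trip on a small named correlation matrix.
m <- stats::cor(mtcars[, 1:3], method = "spearman")
packed <- pack_matrix(m)
restored <- unpack_matrix(packed)
```

With an anndata object, `packed` would be what gets assigned into uns, and `unpack_matrix` would be applied to whatever comes back out; whether the values come back as an R matrix without an explicit conversion is not verified here.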

A related case is a matrix that contains character values. Here I cannot retrieve the matrix with its names and in a proper matrix shape: I get it back flat, even if I try to reshape it.
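
For the flat character case, a minimal base R sketch of rebuilding a square matrix from the flat values and a separately stored vector of names. `reshape_square` is a hypothetical helper, and the example data are made up for illustration. The element order of the flat vector is an assumption: NumPy flattens row-major by default while R fills column-major, so `byrow` may need to be `TRUE` for non-symmetric matrices (for a symmetric matrix either order gives the same result).

```r
# Hypothetical helper: rebuild a named square matrix from flat values.
reshape_square <- function(values, names, byrow = FALSE) {
  values <- as.vector(unlist(values))  # accept a list, 1-d array or vector
  n <- length(names)
  stopifnot(length(values) == n * n)   # values must fill an n x n matrix
  matrix(values, nrow = n, ncol = n, byrow = byrow,
         dimnames = list(names, names))
}

# Made-up flat significance labels for three variables.
flat_sig <- c("", "***", "*",
              "***", "", "",
              "*", "", "")
sig <- reshape_square(flat_sig, c("g1", "g2", "g3"))
sig["g1", "g2"]
```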

The scenario I am working on is a square symmetric correlation matrix, together with matrices of the p-values, the multiple-testing-adjusted p-values, and the significance asterisks.

data$uns$ss.cor <- list(
  names = colnames(data$X),
  corr = stats::cor(data$X, use = "pairwise.complete.obs", method = "spearman"),
  pval = outer(1:ncol(data$X), 1:ncol(data$X), Vectorize(function(i,j)
    cor.test(data$X[,i], data$X[,j], use="pairwise.complete.obs", method = "spearman")[["p.value"]]))
)
data$uns$ss.cor$adj.pval <- matrix(p.adjust(data$uns$ss.cor$pval, method = "BH"), nrow=nrow(data$uns$ss.cor$pval))
data$uns$ss.cor$sig <- matrix(cut(data$uns$ss.cor$adj.pval, c(-.1, 0.001, 0.01, 0.05, Inf), c("***", "**", "*", "")), nrow=nrow(data$uns$ss.cor$pval))
data$uns$ss.cor$params <- list(cor.method = "spearman",
                               cor.use = "pairwise.complete.obs",
                               p.adjust.method = "BH")

To keep it simple I show the above using data$X, but in reality I am using a matrix with a different shape than X, which is why I use uns and not varp.

Thanks!

rcannood (Member) commented Mar 7, 2022

Hey Gilad!

Could you provide me with a small reproducible example?

I'll think about whether I can find a "nice" solution to your problem while still being compatible with the standard anndata interface.

Robrecht
