Support for AnnData/H5AD files #1

Open
mojaveazure opened this issue May 15, 2020 · 28 comments
Labels
enhancement New feature or request

Comments

@mojaveazure
Owner

Tracker for bugs in the h5Seurat/H5AD converter. Please note:

  • All support for reading and writing H5AD files is done through the h5Seurat intermediate. There is no direct Seurat object/H5AD saving and loading
  • There is no support for H5T_COMPOUND datasets found in the obs, var, obsm, and varm slots of older AnnData objects. Modern AnnData objects use HDF5 groups, which are supported in SeuratDisk.
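
To see which layout a given H5AD file uses before converting it, something like the following Python sketch may help. It is not part of SeuratDisk; the function names `classify_slot` and `report_slots` are made up for illustration, and `report_slots` assumes the third-party h5py package is installed. The classifier relies on h5py groups exposing `keys()` but no `dtype`, while compound datasets carry named fields in `dtype.names`.

```python
from types import SimpleNamespace  # only used for the stand-in objects below

SLOTS = ("obs", "var", "obsm", "varm")


def classify_slot(node):
    """Classify an HDF5 node by duck typing (hypothetical helper).

    Groups expose keys() but no dtype; datasets expose dtype, and a
    compound (H5T_COMPOUND) dataset has named fields in dtype.names.
    """
    dtype = getattr(node, "dtype", None)
    if dtype is None:
        return "group" if hasattr(node, "keys") else "unknown"
    if getattr(dtype, "names", None):
        return "compound dataset"
    return "plain dataset"


def report_slots(path):
    """Return {slot: kind} for the slots present in an H5AD file."""
    import h5py  # third-party; imported lazily so classify_slot has no dependencies

    with h5py.File(path, "r") as handle:
        return {name: classify_slot(handle[name]) for name in SLOTS if name in handle}


# Stand-ins for h5py objects, so the classifier can be tried without a file.
old_style_obs = SimpleNamespace(dtype=SimpleNamespace(names=("index", "n_genes")))
new_style_obs = SimpleNamespace(keys=lambda: ["_index", "n_genes"])
print(classify_slot(old_style_obs))  # compound dataset
print(classify_slot(new_style_obs))  # group
```

Any slot reported as a compound dataset would fall under the unsupported case described above.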
@sherifgerges

Hello, thank you so much for putting this up. I am trying to install the package and get the following error. Any ideas as to what's wrong? Thanks so much.

✓ checking for file ‘/private/var/folders/rf/yddlf5ss53968h_zpbkfvkkx1vsnpd/T/RtmpxoZWGw/remotescc9d622d28c/mojaveazure-seurat-disk-007a931/DESCRIPTION’ ...
─ preparing ‘SeuratDisk’:
✓ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘SeuratDisk_0.0.0.9009.tar.gz’
Warning: invalid uid value replaced by that for user 'nobody'
Warning: invalid gid value replaced by that for user 'nobody'

* installing *source* package ‘SeuratDisk’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
  namespace ‘methods’ 3.6.0 is already loaded, but >= 3.6.1 is required
Calls: ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package ‘SeuratDisk’
* removing ‘/Library/Frameworks/R.framework/Versions/3.6/Resources/library/SeuratDisk’
Error: Failed to install 'SeuratDisk' from GitHub:
  (converted from warning) installation of package ‘/var/folders/rf/yddlf5ss53968h_zpbkfvkkx1vsnpd/T//RtmpxoZWGw/filecc9d6c528d28/SeuratDisk_0.0.0.9009.tar.gz’ had non-zero exit status

@ohne416

ohne416 commented Sep 12, 2020

I tried to convert an H5AD file to h5Seurat using the Convert function in the seurat-disk package, but it failed with this error message: "Error: Cannot find feature names in this H5AD file"
The H5AD files were downloaded from https://www.covid19cellatlas.org/
Can you help?

@jlu360a

jlu360a commented Nov 19, 2020

Hi,

First, thanks for developing this nice tool. It has been very helpful. I have a question, and I am not sure if this behavior is expected.
I am trying to read an h5ad file. The source is here:

Source: https://cellxgene.cziscience.com/
DataSet: "Krasnow Lab Human Lung Cell Atlas, 10X"

The h5ad file is around 735 MB. I was successful in converting it to 'h5seurat' format (with a file size of around 1.04 GB). When I try to load it with "LoadH5Seurat", I am surprised that it takes around 30 GB of memory on my Linux computer (CentOS 7). Is this expected?
Here is the version info:

  • R 4.0.3
  • SeuratDisk_0.0.0.9013
  • Seurat_3.2.2

Thanks.

@Isabelle-C

Hi,

Thank you for developing the tool! I was able to convert my h5ad file to h5seurat. However, when I read the h5seurat file, I get the following error:

test <- LoadH5Seurat(file = 'myfilename.h5seurat')
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Initializing scaled with data
Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent

I tried on multiple files and they all result in the same error. Would you please let me know what went wrong? Thank you so much!

@seigfried

pbmc3k <- LoadH5Seurat("NPC_All_4_Labelled.h5seurat")
Validating h5Seurat file
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata
Adding miscellaneous information
Error in if (!x[[i]]$dims) { : argument is of length zero

Not sure where the first warning comes from. The h5Seurat object step works fine, but the second step fails.

@hbandukw

hbandukw commented Feb 22, 2021

Hello,

I was able to successfully convert my integrated assay (with SCT used for normalization) into h5ad but I am unable to read it into scanpy.

Scanpy:

adata = sc.read_h5ad(Seurat_h5ad_path)

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-43-9928f0c89d25> in <module>
----> 1 adata = sc.read_h5ad(Seurat_h5ad_path)

~/opt/anaconda3/envs/scenic_protocol/lib/python3.6/site-packages/anndata/_io/h5ad.py in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    440     _clean_uns(d)  # backwards compat
    441 
--> 442     return AnnData(**d)
    443 
    444 

TypeError: __init__() got an unexpected keyword argument 'active.ident'

Seurat:

  • R version 4.0.2 (2020-06-22)
  • Seurat_4.0.0
  • SeuratDisk_0.0.0.9018

Output:

Creating h5Seurat file for version 3.1.5.9900
Adding counts for RNA
Adding data for RNA
No variable features found for RNA
No feature-level metadata found for RNA
Adding counts for SCT
Adding data for SCT
Adding scale.data for SCT
No variable features found for SCT
No feature-level metadata found for SCT
Writing out SCTModel.list for SCT
Adding data for integrated
Adding scale.data for integrated
Adding variable features for integrated
No feature-level metadata found for integrated
Writing out SCTModel.list for integrated
Adding cell embeddings for pca
Adding loadings for pca
No projected loadings for pca
Adding standard deviations for pca
No JackStraw data for pca
Adding cell embeddings for umap
No loadings for umap
No projected loadings for umap
No standard deviations for umap
No JackStraw data for umap
Validating h5Seurat file
Adding scale.data from integrated as X
Adding data from integrated as raw
Transfering meta.data to obs
Adding dimensional reduction information for pca
Adding feature loadings for pca
Adding dimensional reduction information for umap
Adding integrated_snn as neighbors
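
The traceback above shows `AnnData(**d)` rejecting the keyword `active.ident`, which suggests (this is an inference from the traceback, not something confirmed in this thread) that the converted file has a top-level entry by that name. A minimal Python sketch for listing such entries follows. The helper names are hypothetical, the list of expected keys is an assumption about the usual AnnData on-disk layout that should be checked against the installed anndata version, and `unexpected_keys_in_file` assumes the third-party h5py package.

```python
# Top-level entries an H5AD file is normally expected to contain.
# Assumption: this reflects the common AnnData on-disk layout.
EXPECTED_TOP_LEVEL = frozenset(
    {"X", "obs", "var", "obsm", "varm", "obsp", "varp", "layers", "raw", "uns"}
)


def unexpected_keys(keys, expected=EXPECTED_TOP_LEVEL):
    """Return the sorted keys that are not in the expected set."""
    return sorted(key for key in keys if key not in expected)


def unexpected_keys_in_file(path):
    """List unexpected top-level entries of an H5AD file (hypothetical helper)."""
    import h5py  # third-party; imported lazily so unexpected_keys has no dependencies

    with h5py.File(path, "r") as handle:
        return unexpected_keys(handle.keys())


# Keys chosen to mimic the situation in the traceback above.
example_keys = ["X", "obs", "var", "raw", "obsm", "varm", "uns", "active.ident"]
print(unexpected_keys(example_keys))  # ['active.ident']
```

This only diagnoses the file; it does not change it.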

@lavon79

lavon79 commented May 17, 2021

Hi,

Thank you for developing the tool! I was able to convert my h5ad file to h5seurat. However, when I read the h5seurat file, I get the following error:

test <- LoadH5Seurat(file = 'myfilename.h5seurat')
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Initializing scaled with data
Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent

I tried on multiple files and they all result in the same error. Would you please let me know what went wrong? Thank you so much!

Have you solved this issue? I am getting the same error.

@dfernandezperez

I also get the same error with any Seurat object I try to convert.

  • Seurat 4.0
  • R 4.0.3
  • SeuratDisk_0.0.0.9019

@fly4all

fly4all commented Jun 25, 2021

I was able to convert an .h5ad file from this dataset into .h5seurat, but I can't seem to load the file.

Upon running seuratObject <- LoadH5Seurat("~/Downloads/GSE161228_24h_PN_all.h5seurat")

I get the following error:
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction tsne
Adding cell embeddings for tsne
Adding miscellaneous information for tsne
Adding command information
Adding cell-level metadata
Error: Too many values for levels provided

Do you have any advice on resolving this?

@davidroad

pbmc3k <- LoadH5Seurat("NPC_All_4_Labelled.h5seurat")
Validating h5Seurat file
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata
Adding miscellaneous information
Error in if (!x[[i]]$dims) { : argument is of length zero

Not sure where the first warning comes from. The h5Seurat object step works fine, but the second step fails.

pbmc3k <- LoadH5Seurat("pbmc3k_final.h5seurat",array = "RNA") will work

@jmitchell81

pbmc3k <- LoadH5Seurat("NPC_All_4_Labelled.h5seurat")
Validating h5Seurat file
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata
Adding miscellaneous information
Error in if (!x[[i]]$dims) { : argument is of length zero

Not sure where the first warning comes from. The h5Seurat object step works fine, but the second step fails.

pbmc3k <- LoadH5Seurat("pbmc3k_final.h5seurat",array = "RNA") will work

This also worked for me, but use
assays = "RNA"
instead of
array = "RNA"

@GouQiao

GouQiao commented Dec 20, 2021

Hi, I want to use the Convert function to convert an h5ad file to Seurat.
But I got the following error:

Error in self$write_low_level(value, file_space = self_space_id, mem_space = mem_space_id, :
Number of objects in robj is not the same and not a multiple of number of elements selected in file: expected are 0 but provided are 3000

Does anyone know how to solve this?

@divyanshusrivastava

Hi. I am also facing issues while reading the (successfully converted) h5Seurat file. Here is the traceback:

Validating h5Seurat file

Initializing RNA with data

Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], : all(dims >= dims.min) is not TRUE
Traceback:

  1. LoadH5Seurat("temp_pdx_adata.h5seurat")
  2. LoadH5Seurat.character("temp_pdx_adata.h5seurat")
  3. LoadH5Seurat(file = hfile, assays = assays, reductions = reductions,
    . graphs = graphs, neighbors = neighbors, images = images,
    . meta.data = meta.data, commands = commands, misc = misc,
    . tools = tools, verbose = verbose, ...)
  4. LoadH5Seurat.h5Seurat(file = hfile, assays = assays, reductions = reductions,
    . graphs = graphs, neighbors = neighbors, images = images,
    . meta.data = meta.data, commands = commands, misc = misc,
    . tools = tools, verbose = verbose, ...)
  5. as.Seurat(x = file, assays = assays, reductions = reductions,
    . graphs = graphs, neighbors = neighbors, images = images,
    . meta.data = meta.data, commands = commands, misc = misc,
    . tools = tools, verbose = verbose, ...)
  6. as.Seurat.h5Seurat(x = file, assays = assays, reductions = reductions,
    . graphs = graphs, neighbors = neighbors, images = images,
    . meta.data = meta.data, commands = commands, misc = misc,
    . tools = tools, verbose = verbose, ...)
  7. AssembleAssay(assay = assay, file = x, slots = assays[[assay]],
    . verbose = verbose)
  8. as.matrix(x = assay.group[["data"]])
  9. as.matrix.H5Group(x = assay.group[["data"]])
  10. as.sparse(x = x, ...)
  11. as.sparse.H5Group(x = x, ...)
  12. sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][],
    . dims = h5attr(x = x, which = "dims"))
  13. stopifnot(all(dims >= dims.min))
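
In the traceback above, `indices + 1` is passed as `i` and `indptr` as `p`, so the arrays are read as compressed-column data, with the declared shape taken from the `dims` attribute. The failing `stopifnot` indicates that the declared dimensions are smaller than what those arrays require. The Python sketch below shows that kind of consistency check on plain lists. It is an illustration of the condition, not SeuratDisk's or Matrix's actual code, and the function names are hypothetical.

```python
def minimal_dims(indices, indptr):
    """Smallest (rows, cols) able to hold 0-based compressed-column arrays."""
    min_rows = max(indices) + 1 if len(indices) else 0
    min_cols = max(len(indptr) - 1, 0)
    return (min_rows, min_cols)


def dims_are_consistent(dims, indices, indptr):
    """True when each declared dimension is at least the minimum required."""
    needed = minimal_dims(indices, indptr)
    return all(declared >= required for declared, required in zip(dims, needed))


# Toy matrix with 3 rows and 2 columns:
# column 0 holds rows 0 and 2, column 1 holds row 1.
indices = [0, 2, 1]
indptr = [0, 2, 3]
print(minimal_dims(indices, indptr))                 # (3, 2)
print(dims_are_consistent((3, 2), indices, indptr))  # True
print(dims_are_consistent((2, 3), indices, indptr))  # False
```

The last call uses swapped dimensions purely as an example of an inconsistent declaration; whether that is what happened in this file is not known from the thread.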

@giorgiatosoni

Hi,
Thank you for developing the tool! I was able to convert my h5ad file to h5seurat. However, when I read the h5seurat file, I get the following error:
test <- LoadH5Seurat(file = 'myfilename.h5seurat')
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Initializing scaled with data
Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent
I tried on multiple files and they all result in the same error. Would you please let me know what went wrong? Thank you so much!

Have you solved this issue? I am getting the same error.

Hi, did someone solve this issue??

@ishwarvh

Hello,

I was able to successfully convert my integrated assay (with SCT used for normalization) into h5ad but I am unable to read it into scanpy.

Scanpy:

adata = sc.read_h5ad(Seurat_h5ad_path)

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-43-9928f0c89d25> in <module>
----> 1 adata = sc.read_h5ad(Seurat_h5ad_path)

~/opt/anaconda3/envs/scenic_protocol/lib/python3.6/site-packages/anndata/_io/h5ad.py in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    440     _clean_uns(d)  # backwards compat
    441 
--> 442     return AnnData(**d)
    443 
    444 

TypeError: __init__() got an unexpected keyword argument 'active.ident'

Seurat:

  • R version 4.0.2 (2020-06-22)
  • Seurat_4.0.0
  • SeuratDisk_0.0.0.9018

Output:

Creating h5Seurat file for version 3.1.5.9900
Adding counts for RNA
Adding data for RNA
No variable features found for RNA
No feature-level metadata found for RNA
Adding counts for SCT
Adding data for SCT
Adding scale.data for SCT
No variable features found for SCT
No feature-level metadata found for SCT
Writing out SCTModel.list for SCT
Adding data for integrated
Adding scale.data for integrated
Adding variable features for integrated
No feature-level metadata found for integrated
Writing out SCTModel.list for integrated
Adding cell embeddings for pca
Adding loadings for pca
No projected loadings for pca
Adding standard deviations for pca
No JackStraw data for pca
Adding cell embeddings for umap
No loadings for umap
No projected loadings for umap
No standard deviations for umap
No JackStraw data for umap
Validating h5Seurat file
Adding scale.data from integrated as X
Adding data from integrated as raw
Transfering meta.data to obs
Adding dimensional reduction information for pca
Adding feature loadings for pca
Adding dimensional reduction information for umap
Adding integrated_snn as neighbors

Hello, were you able to figure out the issue here?

@peralesvilchezl

Hi,

First, thanks for developing this nice tool. It has been very helpful. I have a question, and I am not sure if this behavior is expected. I am trying to read an h5ad file. The source is here:

Source: https://cellxgene.cziscience.com/ DataSet: "Krasnow Lab Human Lung Cell Atlas, 10X"

The h5ad file is around 735 MB. I was successful in converting it to 'h5seurat' format (with a file size of around 1.04 GB). When I try to load it with "LoadH5Seurat", I am surprised that it takes around 30 GB of memory on my Linux computer (CentOS 7). Is this expected? Here is the version info:

  • R 4.0.3
  • SeuratDisk_0.0.0.9013
  • Seurat_3.2.2

Thanks.

Hey I have the same problem!!

@xiao-kong-long

Hi, I would like to know how to process the spatial information of 10x Visium data. My code is as follows:

In R:

seurat.object = Load10X_Spatial(data.dir = h5.dir, filename = filename)
SaveH5Seurat(seurat.object, filename = data.h5seurat.url)
Convert(data.h5seurat.url, dest = 'h5ad')

In Python:
adata = sc.read(input_dir + '/test.h5ad')

The converted adata loses a lot of information, especially the spatial positions and the image.
I don't know of any solution to this.

@xiao-kong-long

#67

@DanielMedic

pbmc3k <- LoadH5Seurat("NPC_All_4_Labelled.h5seurat")
Validating h5Seurat file
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata
Adding miscellaneous information
Error in if (!x[[i]]$dims) { : argument is of length zero

Not sure where the first warning comes from. The h5Seurat object step works fine, but the second step fails.

I'm having the exact same error.

@maxjcarlino

Hello,
Thank you for developing this tool! It is extremely valuable to my research in working with multiple collaborators.
I was able to successfully convert an h5Seurat file to an H5AD file; however, for some reason I only obtain the top 2000 variable features in the converted object. The h5Seurat file still contains all features, so it seems I am missing something in the Convert function. Could you help me figure out what I need to change to export all features instead of only the variable features?

LoadH5Seurat(paste(dataDir, "ssEpcam.h5Seurat",sep = ""))
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding variable feature information for RNA
Adding miscellaneous information for RNA
Initializing prediction.score.State with data
Adding counts for prediction.score.State
Adding miscellaneous information for prediction.score.State
Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from prediction.score.State_ to predictionscoreState_
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding graph RNA_nn
Adding graph RNA_snn
Adding command information
Adding cell-level metadata
Adding miscellaneous information
Adding tool-specific results
An object of class Seurat
24448 features across 12089 samples within 2 assays
Active assay: RNA (24393 features, 2000 variable features)
1 other assay present: prediction.score.State
2 dimensional reductions calculated: pca, umap
Convert(paste(dataDir, "ssEpcam.h5Seurat",sep = ""), dest = "h5ad", assay = "RNA", overwrite=TRUE)
Validating h5Seurat file
Adding scale.data from RNA as X
Transfering meta.features to var
Adding data from RNA as raw
Transfering meta.features to raw/var
Transfering meta.data to obs
Adding dimensional reduction information for pca
Adding feature loadings for pca
Adding dimensional reduction information for umap
Adding RNA_snn as neighbors

And when I load the h5ad into scanpy I get:

anndata = scanpy.read_h5ad(os.path.join(chdir, 'data/ssEpcam.h5ad'))
anndata
AnnData object with n_obs × n_vars = 12089 × 2000
obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'type', 'stage', 'embryo', 'percent.mt', 'RNA_snn_res.0.8', 'seurat_clusters', 'RNA_snn_res.0.5', 'RNA_snn_res.0.25', 'BC', 'sex', 'S.Score', 'G2M.Score', 'Phase', 'Kernel', 'predicted.State.score', 'State', 'CellCluster', 'nCount_prediction.score.State', 'nFeature_prediction.score.State', 'RNA_snn_res.0.2'
var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable'
uns: 'neighbors'
obsm: 'X_pca', 'X_umap'
varm: 'PCs'
obsp: 'distances'

Thank you in advance for your help!!

@1098255342

Have you solved this issue? I am getting the same error.

@maxjcarlino

Have you solved this issue? I am getting the same error.

Yes. The Convert function pulls the scaled data (scale.data) into X, and we could not find an argument that changes which slot it pulls from, so it always exported only the 2000 scaled variable genes:

Adding scale.data from RNA as X
Transfering meta.features to var
Adding data from RNA as raw

The solution that worked was to rescale the object in R, which rescales all genes:

data <- ScaleData(data, features = rownames(data))

Once I did that, saved the object as h5Seurat, and converted it, the full gene list was exported.
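
On the Python side, the conversion log above also reports "Adding data from RNA as raw", so the converted file may already carry the full feature set in `adata.raw` even when X holds only the scaled variable genes. A minimal sketch, assuming that is the case: the helper name `with_all_features` is hypothetical, while `raw.n_vars` and `raw.to_adata()` are part of the anndata API. The returned object holds the `data` slot values rather than the scaled values.

```python
from types import SimpleNamespace  # only used for the stand-in objects below


def with_all_features(adata):
    """Return an AnnData built from .raw when .raw has more features than X."""
    raw = getattr(adata, "raw", None)
    if raw is not None and raw.n_vars > adata.n_vars:
        return raw.to_adata()
    return adata


# Stand-ins that mimic the shapes reported in this comment, so the helper
# can be tried without anndata installed.
full = SimpleNamespace(n_vars=24393, raw=None)
raw = SimpleNamespace(n_vars=24393, to_adata=lambda: full)
converted = SimpleNamespace(n_vars=2000, raw=raw)
print(with_all_features(converted).n_vars)  # 24393
```

With a real file, the same helper would be called on the result of `scanpy.read_h5ad(...)`.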

@DM0815

DM0815 commented Jun 21, 2023

Have you solved this issue? I am getting the same error.

Yes. The Convert function pulls the scaled data (scale.data) into X, and we could not find an argument that changes which slot it pulls from, so it always exported only the 2000 scaled variable genes:

Adding scale.data from RNA as X
Transfering meta.features to var
Adding data from RNA as raw

The solution that worked was to rescale the object in R, which rescales all genes:

data <- ScaleData(data, features = rownames(data))

Once I did that, saved the object as h5Seurat, and converted it, the full gene list was exported.

How do I change your code 'data <- ScaleData(data, features = rownames(data))' if my Seurat object is named s?

@1098255342

1098255342 commented Jun 21, 2023 via email

@maxjcarlino

Have you solved this issue? I am getting the same error.

Yes. The Convert function pulls the scaled data (scale.data) into X, and we could not find an argument that changes which slot it pulls from, so it always exported only the 2000 scaled variable genes:

Adding scale.data from RNA as X
Transfering meta.features to var
Adding data from RNA as raw

The solution that worked was to rescale the object in R, which rescales all genes:

data <- ScaleData(data, features = rownames(data))

Once I did that, saved the object as h5Seurat, and converted it, the full gene list was exported.

How do I change your code 'data <- ScaleData(data, features = rownames(data))' if my Seurat object is named s?

Just replace "data" with the name of your Seurat object as it appears in your R workspace.

@Tingtingyang1234

Hello, when I convert my h5ad file to h5seurat, I get this error:
Warning: Unknown file type: h5ad
Creating h5Seurat file for version 3.1.5.9900
Adding X as scale.data
Adding raw/X as data
Adding raw/X as counts
Adding meta.features from raw/var
Merging dispersions from scaled feature-level metadata
Merging dispersions_norm from scaled feature-level metadata
Merging feature_types from scaled feature-level metadata
Error in source[["var"]][[mf]]$read() : attempt to apply non-function

Can you help me?

@karlie002

Hi, when I tried to use the LoadSeurat function on a .h5ad object, I got error messages like:
Validating h5Seurat file
Initializing RNA with data
Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], :
'dims' must contain all (i,j) pairs
Has anyone else encountered this problem?
Any suggestions will be appreciated !

@madeofrats

Same here.
