read_xml_num: Lacks Example and Detailed Explanation #415

Open
discoleo opened this issue Nov 3, 2023 · 0 comments

Function xml_find_num (named read_xml_num in the issue title)

Documentation

The documentation for this function lacks an example and a detailed explanation, so I am unsure what the function actually does or how to use it.

All of the xml_find_num calls in the examples below generate an error:

Error: result of type: ‘list’, not numeric

Other Issues

This issue is also related to issue #356.

Example

I am trying to extract the year from the corresponding node of each record. Some records may lack a value for the year, but it makes sense to process the set sequentially and retrieve an NA for those records. Skipping such nodes would shift the remaining values and scramble the data, because I am interested in the whole record (in this example, the Title and the Year). The straightforward solution would seem to be xml_find_num, but it does not appear to do this.

library("xml2")

### All Years Present

sx = "<?xml version=\"1.0\" ?>
<ArticleSet>
<a><b>Title 1</b>
	<c>2023</c>
</a>
<a><b>Title 2</b>
	<c>2022</c>
</a>
<a><b>Title 3</b>
	<c>2023</c>
</a>
</ArticleSet>"

x = read_xml(sx)

ns = xml_find_all(x, "/ArticleSet/a")
ns

# returns the 3 text nodes
xml_find_all(ns, ".//c/text()")
# both calls fail with: Error: result of type: ‘list’, not numeric
xml_find_num(ns, ".//c/text()")
xml_find_num(ns, ".//c")
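
A possible explanation, offered as a minimal sketch rather than a confirmed description of the API: the error message suggests that xml_find_num expects an XPath expression that itself evaluates to a number (for example number(...) or count(...)), not one that evaluates to a node set. The document and the variable names (doc, recs, years) below are made up for illustration; number() is a standard XPath 1.0 function.

```r
library(xml2)

# Small illustrative document (not the data from the example above)
doc <- read_xml("<ArticleSet>
<a><b>Title 1</b><c>2023</c></a>
<a><b>Title 2</b><c>2022</c></a>
</ArticleSet>")

recs <- xml_find_all(doc, "/ArticleSet/a")

# number() converts the string value of the first matching <c> to a number,
# so the XPath result is a number instead of a node set.
# Assumption: xml_find_num is applied to each record in turn.
years <- xml_find_num(recs, "number(.//c)")
years
```

If this reading is right, the documentation could state it explicitly and include an example of this kind.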


####################
### One Year Missing

sx = "<?xml version=\"1.0\" ?>
<ArticleSet>
<a><b>Title 1</b>
	<c>2023</c>
</a>
<a><b>Title 2</b>
	<c></c>
</a>
<a><b>Title 3</b>
	<c>2023</c>
</a>
</ArticleSet>"

x = read_xml(sx)

ns = xml_find_all(x, "/ArticleSet/a")
ns

# One node is missing the year.
# !! xml_find_all actually skips this node !!
xml_find_all(ns, ".//c/text()")
# nodeset with only 2023 & 2023
xml_text(xml_find_all(ns, ".//c"))
# "2023" ""     "2023"
# fails with: Error: result of type: ‘list’, not numeric
xml_find_num(ns, ".//c/text()")
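
For reference, here is the kind of result I am after, sketched with a workaround that does not use xml_find_num at all. It relies on xml_find_first returning one result per input node (with a missing node where nothing matches) and on xml_text returning NA for a missing node. The document and the variable names (doc, recs, title, year) are made up for illustration, and the third record drops the c node entirely to cover that case as well.

```r
library(xml2)

# Illustrative document: one empty <c> and one record with no <c> at all
doc <- read_xml("<ArticleSet>
<a><b>Title 1</b><c>2023</c></a>
<a><b>Title 2</b><c></c></a>
<a><b>Title 3</b></a>
</ArticleSet>")

recs <- xml_find_all(doc, "/ArticleSet/a")

# xml_find_first() keeps one entry per record, so positions stay aligned
title <- xml_text(xml_find_first(recs, "./b"))
year_chr <- xml_text(xml_find_first(recs, "./c"))

# Empty strings and missing nodes both become NA after conversion
year <- suppressWarnings(as.numeric(year_chr))

data.frame(title = title, year = year)
```

Each record keeps its own row, and the records without a usable year get NA instead of being skipped. An equivalent behaviour (or a documented alternative) for xml_find_num would be helpful.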