Questions tagged [extract]

Questions related to retrieving specific information from a (typically minimally structured) data source, such as a web site, media file, source code collection or compressed archive (in which case the desired information is one or more original, uncompressed files). When using this tag, please include additional tags to clarify which specific environment/language/scenario your question refers to.

Data extraction is a term with many different but related meanings, including:

  • Parsing files (such as HTML pages) or file metadata in order to obtain certain information.

  • Retrieving single frames from audio, video or image files

  • Breaking up functionality in a single source code unit (e.g. a function) into multiple units.

  • Retrieving the original files from an (optionally compressed) archive file, such as a .zip or .tar file.

and should be added as a synonym for this tag.
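
As a rough illustration of two of the meanings listed above (retrieving a file from an archive, then parsing it to obtain a piece of information), here is a minimal Python sketch using only the standard library. The helper names `extract_member`, `extract_title` and `make_demo_archive`, and the file name `page.html`, are hypothetical and exist only for this example; real questions under this tag usually involve other tools and languages.

```python
import io
import zipfile
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only keep text that appears while we are inside <title>.
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Parse an HTML string and return its title text (hypothetical helper)."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def extract_member(archive_bytes: bytes, name: str) -> bytes:
    """Return the uncompressed bytes of one member of an in-memory .zip archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(name)


def make_demo_archive() -> bytes:
    """Build a small in-memory .zip so the example needs no files on disk."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(
            "page.html",
            "<html><head><title>Hello</title></head><body></body></html>",
        )
    return buffer.getvalue()


if __name__ == "__main__":
    data = make_demo_archive()
    html = extract_member(data, "page.html").decode("utf-8")
    print(extract_title(html))
```

The archive is built in memory purely so the sketch is self-contained; with a real archive you would pass a file path to `zipfile.ZipFile` instead.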

6876 questions
1
vote
2 answers

Create table from a string in cell by extracting values using regex

I am trying to write a pretty big macro to pull data from one sheet and parse and transform it into different sheets. Going pretty well in everything else. However, I am stuck with 1 part. I have multiline text in a cell (11 lines in this example,…
Achal Desai
1
vote
1 answer

Change the output column name to match the input file name when copying a specific column from multiple CSV files into a new CSV file (shell scripting)

I'm trying to extract a particular column (the 4th) from multiple (several thousand) CSV files and write it to a new file, in the same order in which the files are sequenced in a folder, and the first CSV file would provide all four columns. Now I'm…
1
vote
1 answer

Set grid cells that do not overlap with polygons to NA

I want to extract those grid cells that overlap with the polygons. I used "crop" and "mask", but it is not working properly and does not detect all the grid cells correctly. #US shapefile Us <- readOGR(dsn ="C:/shapefile",layer="boundaries") #…
Arwen
1
vote
2 answers

How to extract the first two characters of the second word from a column in Excel?

I wanted to extract the first two characters of the second word in a column in Excel. That was my goal. I have already found the solution and I would like to share it with you, because I searched the net and had not found one. Here is the…
1
vote
2 answers

Extract data under subheadings from a text column in SQL Server

I have a text field in a table. The field has subheadings and I want to extract the data under each subheading and create columns named after the subheadings. For example: ID. Text 1 NAME: abc. COMPANY: cuz. ADDRESS: dfg Required…
Sue_sue
1
vote
1 answer

terra extract hangs, what is my best diagnosis?

I want to extract a spatraster from a spatvector, > rand_samp class : SpatRaster dimensions : 4683, 1869, 1 (nrow, ncol, nlyr) resolution : 40, 40 (x, y) extent : 54689.98, 129450, 5893846, 6081166 (xmin, xmax, ymin, ymax) coord.…
Mathew Vickers
1
vote
1 answer

How can I extract text before a specific character that can occur multiple times in different places within a single cell

I'm trying to find a single formula (preferably, Excel 2016) or VBA solution that would work in my case. Basically I have a column where every cell contains different text that looks like this: google (@), microsoft (#), hewlett and packard (@), tesla…
1
vote
1 answer

Extract values from csv file using string keywords in columns and assign values to another csv file

I'm a beginner learning Python. I'm doing data manipulation of CSV files using pandas. I'm working on two CSV files: Extract.csv as the working file and Masterlist.csv as the dictionary. The keywords I'm supposed to use are strings from the Description…
ATD
1
vote
4 answers

Extract one number from a string with conditions

The code below extracts all numbers from a string and even combines them. But I need to extract only one whole number, with these rules: 1- the number is one or two digits (plus the decimal part if it exists). 2- if the number is followed by " or inch or in ,…
Waleed
1
vote
1 answer

Python: extracting text from a PDF with the pdfplumber and pdfminer packages duplicates bolded characters

Goal: extract Chinese financial report text. Implementation: the Python pdfplumber/pdfminer packages, extracting PDF text to txt. Problem: for PDF text in bold, the corresponding extracted text in the txt file is duplicated. Examples are as follows: Such as the following…
1
vote
3 answers

How to get the difference between two dates in years, months and days?

I have 2 date values in my FB3 table. STARTDATE = 09/01/2021 ENDDATE = 05/01/2023 I need to get difference between the 2 dates as YEAR, MONTH, DAY According to above dates, the result = YEAR = 1, MONTH = 8, DAYS = 0 I tried SELECT EXTRACT(YEAR…
1
vote
1 answer

Pasting and writing data onto existing Excel file using R

Let's say I already have an existing Excel workbook where the cells are formatted aesthetically. How do I paste the data in a specific column into the Excel workbook that may have merged cells? Using iris as an example, if I want to paste the first…
Sunny League
1
vote
2 answers

Why is pandas.series.str.extract not working here but working elsewhere

Why is a pandas.series.extract(regex) able to print the correct values, but won't assign the value to an existing variable using indexing or np.where. import pandas as pd import numpy as np df = pd.DataFrame( [ ['1', np.nan, np.nan, '1…
DrWhat
1
vote
0 answers

Excel VBA doesn't name extracted pdf after cell

I'm trying to make a macro that converts the active sheet to PDF. I found the code on the internet and made small adjustments. The macro works and exports the active sheet to PDF, but the problem is with the file name. It doesn't name it after cell SS5(this cell has…
Vokoli S
1
vote
0 answers

Web scraping: collect chart data

I am completely new to web scraping, and I've decided to go for it by learning some basics of Python. The data I would like to collect is the chart on the following website: "https://www.amundi-ee.com/entr/product/view/QS0009102334" (no need for a…