I have a vector of strings from a survey that holds the respondents' job positions. Some of the replies are: ceo, Ceo, CEO, ceo/owner, ceo/founder.

I want to replace any string that contains the word ceo (upper case, lower case, with a space before or after it) with CEO.

I have tried the code below; it replaces some of them, but not all.

aiprm$job_title <- gsub("ceo|CEO|owner|Ceo|Owner|executive|CEO |CEo|CE0|CEO/CEO|\\ceo|\\CEO", "CEO",aiprm$job_title)

Some are still missed, for example: Broker / CEO, Business CEO/ Health Coach, CEO& Producer, CEO/ Creative Director, CEO/Designer, CEO/Operator.

Mark
Humberto R
  • It might be helpful if you provide other strings, including some that should not match in addition to these that should match and do not. – r2evans Aug 23 '23 at 21:05
  • can you do `replace(aiprm$job_title, grepl("\\bCEO\\b", aiprm$job_title, ignore.case=TRUE), "CEO")` or `gsub(".*\\bCEO\\b.*", "CEO", aiprm$job_title, ignore.case=TRUE)`? – r2evans Aug 23 '23 at 21:06
  • Hello https://stackoverflow.com/users/3358272/r2evans thanks! Here is the data. https://drive.google.com/file/d/1_ywESP2cYat7y42VpzmPl-Gv6HYTQmFX/view?usp=drive_link – Humberto R Aug 25 '23 at 19:03

1 Answer

Use grep (or grepl) to find the rows where "ceo" is matched, ignoring case, and then replace the whole string with "CEO" rather than only the matched part.

quux[grep("ceo", quux$job_title, ignore.case = TRUE),1:2]
#          job_title                                 industry
# 14             CEO                              Advertising
# 28             CEO                         Marketing Agency
# 64             CEO                                Education
# 70   Founder & CEO                            AI Consulting
# 81             CEO                         ZK ART criations
# 83             CEO                                Marketing
# 110            Ceo                        Digital marketing
# 111            CEO                               Web Design
# 120            CEO                Marketing & Advertisement
# 124            CEO                                Trainings
# 125            CEO                               Healthcare
# 128            CEO                             consultation
# 132            CEO                              IT-Services
# 144            Ceo                                    Media
# 167            CEO                    BRANDING AND PRINTING
# 176            CEO                        Civil Engineering
# 180            ceo                                 software
# 195            ceo                        marketing digital
# 210 CEO & Producer          Comm (Rádio and Audio Producer)
# 217            CEO                                 services
# 253            CEO             Trucking; Travel, e-commerce
# 256            CEO                                     Home
# 262 President, CEO                    Management Consulting
# 272            CEO       Short Term Rentals and Hospitality
# 280            ceo                              eletrônicos
# 285      Ceo/owner                             Entrepreneur
# 312            CEO Nonprofit- services for people with I/DD
# 316            Ceo                                Marketing
# 321  Founder & CEO                            Digital Media
# 330            CEO                                       PR
# 333            CEO                                Marketing
# 337            ceo                                     agri
# 359            CEO                                Marketing
# 366            CEO                                    Media
# 378            CEO                         IT/SocialNetwork
# 404            CEO                 Digital Marketing Agency
# 419            CEO                                Publicité
# 431            Ceo                                      Ceo
# 439            CEO                                     SaaS
# 442            CEO                        Digital Marketing
# 443            CEO          wellness and health, Real State
# 445      Owner/ceo                               Disability
# 452            CEO            Advising and Entrepreneurship
# 453            CEO                     KCARBONFREE G-W-RBIO
# 458            CEO                                 Software
quux$job_title[grepl("ceo", quux$job_title, ignore.case = TRUE)] <- "CEO"
quux[grep("ceo", quux$job_title, ignore.case = TRUE),1:2]
#     job_title                                 industry
# 14        CEO                              Advertising
# 28        CEO                         Marketing Agency
# 64        CEO                                Education
# 70        CEO                            AI Consulting
# 81        CEO                         ZK ART criations
# 83        CEO                                Marketing
# 110       CEO                        Digital marketing
# 111       CEO                               Web Design
# 120       CEO                Marketing & Advertisement
# 124       CEO                                Trainings
# 125       CEO                               Healthcare
# 128       CEO                             consultation
# 132       CEO                              IT-Services
# 144       CEO                                    Media
# 167       CEO                    BRANDING AND PRINTING
# 176       CEO                        Civil Engineering
# 180       CEO                                 software
# 195       CEO                        marketing digital
# 210       CEO          Comm (Rádio and Audio Producer)
# 217       CEO                                 services
# 253       CEO             Trucking; Travel, e-commerce
# 256       CEO                                     Home
# 262       CEO                    Management Consulting
# 272       CEO       Short Term Rentals and Hospitality
# 280       CEO                              eletrônicos
# 285       CEO                             Entrepreneur
# 312       CEO Nonprofit- services for people with I/DD
# 316       CEO                                Marketing
# 321       CEO                            Digital Media
# 330       CEO                                       PR
# 333       CEO                                Marketing
# 337       CEO                                     agri
# 359       CEO                                Marketing
# 366       CEO                                    Media
# 378       CEO                         IT/SocialNetwork
# 404       CEO                 Digital Marketing Agency
# 419       CEO                                Publicité
# 431       CEO                                      Ceo
# 439       CEO                                     SaaS
# 442       CEO                        Digital Marketing
# 443       CEO          wellness and health, Real State
# 445       CEO                               Disability
# 452       CEO            Advising and Entrepreneurship
# 453       CEO                     KCARBONFREE G-W-RBIO
# 458       CEO                                 Software

This is a sample of the data used above; the full data set is too big for a Stack answer.

quux <- structure(list(job_title = c("Marketing Director", "Digital Content Manager", "Owner", "Content Principal", "Chief Consultant", "Managing Director", "Senior SEO Analyst", "Senior SEO Specialist", "Head of Content", "", "", "", "content manager", "CEO", "Head of SEO", "Owner/SEO Consultant", "Snr Manager, SEO + Talent", "Head of Web and Digital Communications", "VP of SEO & Content", "SEO consultant"), industry = c("Sporting Goods", "Marketing and Advertising", "Software", "Tech", "Business technology services",  "Marketing", "SEO", "Automotive", "Marketing", "", "", "", "entertainment", "Advertising", "Digital Marketing", "Marketing", "SEO", "Online publishing", "private equity / finance", "SEO")), row.names = c(NA, 20L), class = "data.frame")
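
For a quick check without the data frame, the same idea can be sketched on a plain character vector. This is only an illustration: `titles` is a hypothetical vector built from job titles quoted in the question, plus two made-up titles that should stay unchanged. It also shows why a `gsub()` on the bare pattern leaves things like "ceo/owner" only partly changed, and uses the `\\b` word-boundary variant suggested in the comments so that "ceo" has to appear as a separate word.

```r
# Hypothetical sample: titles quoted in the question, plus two that
# should not be changed ("Owner", "Head of SEO").
titles <- c("ceo", "Ceo", "CEO", "ceo/owner", "Broker / CEO ",
            "CEO& Producer", "Owner", "Head of SEO")

# gsub() rewrites only the matched substring; the rest of the title stays.
partial <- gsub("ceo", "CEO", titles, ignore.case = TRUE)

# Flag every element where "ceo" occurs as a separate word (\\b is a
# word boundary), then overwrite the whole element with "CEO".
is_ceo <- grepl("\\bceo\\b", titles, ignore.case = TRUE)
whole <- replace(titles, is_ceo, "CEO")

print(partial)
print(whole)
```

Whether the word boundary is wanted depends on the data; with a plain "ceo" pattern, any title that merely contains those three letters in sequence would be overwritten as well.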
r2evans