
I have components like the ones below in Jira, and I am trying to write a regex that extracts a version value from each component name. The value could be anything, e.g. 1.1, 1.a or a.a, or it may be missing entirely. I don't need suffixes such as R or (U1); I only need values like 1.0, 1.1, 1.a, a.1, 1.x.

So I need only the first character or digit before the period (.) and the first character or digit after it; if there is no period, the result should be blank.

Component name ----------------- needed

PCN 9.4U1 (Act) ---------------- 1.4
PCN 9.5 (Act) ------------------ 1.5
PCN 9.6 (Act) ------------------ 1.6
R AA 7.5U5 (Arch) -------------- 2.5
R AA 7.6U2 (Rel) --------------- 2.6
R AA 37.7R (Arch) -------------- 2.7
R TEST 1.x (Fut) --------------- 2.x
R testp U2 --------------------- no value

I am using the regex below to get the value:

Fixversionmat = re.findall(r"(\d+\.\d+)", jsonToPython['name'])

but this only gives me a result when the value is purely numeric, like 1.1; in all other cases it fails.

unknown
  • If it can be "anything" then how is a regexp supposed to match it? You need to specify what it can be precisely. – Barmar Apr 04 '18 at 22:13
  • @Barmar Sorry about that. "Anything" here means it could be a number like 1.1, a mix of characters like 1.a or a.4, or nothing – unknown Apr 04 '18 at 22:15
  • Does it always contain a `.`? Maybe `\S+\.\S+` – Barmar Apr 04 '18 at 22:16
  • @Barmar In most cases yes, but sometimes the name doesn't contain a version at all, e.g. R olcPa GA – unknown Apr 04 '18 at 22:17
  • So in those cases the regexp won't match, and you can test for that separately. – Barmar Apr 04 '18 at 22:19
  • You can't match *nothing* unless there's a pattern for something before or after it. – Barmar Apr 04 '18 at 22:20
  • @Barmar it makes sense but for now I can exclude this one case – unknown Apr 04 '18 at 22:22
  • You don't even need regexes on this example. The alphanumeric value is always the rightmost non-parenthesized string which contains any digit. So just throw away everything from '(' on, split on space, iterate right-to-left, and return the first string containing a number. Else return `None`. – smci Apr 04 '18 at 22:38
  • I'm downvoting this question because despite numerous requests you refuse to describe the pattern of your version numbers clearly. Of course, if you knew what the pattern was you probably could write the regexp yourself. – Barmar Apr 04 '18 at 23:05
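
The regex-free approach described in smci's comment above can be sketched roughly as follows. This is only an illustration: the function name `rightmost_version_token` is hypothetical, and the sample names are taken from the question.

```python
def rightmost_version_token(name):
    """Return the rightmost non-parenthesized token containing a digit, else None."""
    # Throw away everything from the first '(' on.
    head = name.split("(", 1)[0]
    # Walk the space-separated tokens from right to left.
    for token in reversed(head.split()):
        if any(ch.isdigit() for ch in token):
            return token
    return None


for name in ["PCN 9.4U1 (Act)", "R TEST 1.x (Fut)", "R testp U2", "R LPTT GA"]:
    print(repr(name), "->", rightmost_version_token(name))
```

Note that `R testp U2` yields `U2` here, because that token contains a digit. If such names should count as "no value", an additional check for a period in the token would be needed.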

2 Answers


`\w+\.\w+` will match two strings of alphanumeric characters separated by a `.`.

Fixversionmatch = re.findall(r"\w+\.\w+", jsonToPython['name'])
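
If only the single character on each side of the period is wanted, a variation with two capture groups might look like the sketch below. It assumes "first character before/after the period" means the characters directly adjacent to the first period in the name; `short_version` is a hypothetical helper, and the sample names come from the question.

```python
import re


def short_version(name):
    """Return 'X.Y' from the characters adjacent to the first period, or '' if none."""
    # (\w) on each side captures exactly one alphanumeric (or underscore) character.
    m = re.search(r"(\w)\.(\w)", name)
    return f"{m.group(1)}.{m.group(2)}" if m else ""


for name in ["PCN 9.4U1 (Act)", "R AA 37.7R (Arch)", "R TEST 1.x (Fut)", "R testp U2"]:
    print(repr(name), "->", repr(short_version(name)))
```

Using `re.search` instead of `re.findall` makes the "no period" case explicit: the function returns an empty string when nothing matches.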
Barmar

You get most of your examples with:

(\d[^ \n]*| [a-zA-Z]+?\d[^ \n]*).*$

Link: https://regexr.com/3ncbs

It does not catch version numbers that consist of letters only, but it does catch mixed ones such as 4.x.

t = """CTX     
CTX 4.0R (Released)     
CTX 4.1 (Released)
CTX 4.2 (Released)
CTX 4.2R2 (FRtRre)  
CTX 4.3 (Released)  
CTX 4.4 (Released)  
CTX 4.4R1 (Active)  
CTX 4.5 (Active)    
CTX 4.6 (Active)
R PX 3.5R3 (Archived)           
R PX 3.5R4 (Archived)               
R PX 3.5R5 (Archived)               
R PX 3.6R2 (Released)               
R PX 3.6R3 (Rnreleased)             
R PX 3.6R4 (Released)               
R PX 3.6R5 (Active)             
R PX 3.7R (Archived)                
R PX 3.7R1 (Released)           
R PX 3.7R2 (Active)             
R PX 3.8R (Released)            
R PX 3.8R1 (Released)               
R PX 3.8R2 (Released)           
R PX 3.8R3 (Released)               
R PX 3.8R4 (Active)
R LPTT GA   
R LPTT R1   
R LPTT R2
R Cianara 4.1R2 (Early Access)
R Cianara 4.x (FRtRre)
R NRnPA R2"""

import re

vers = re.findall(r'(\d[^ \n]*| [a-zA-Z]+?\d[^ \n]*).*$',t,re.MULTILINE)
print(vers)

Output:

['4.0R', '4.1', '4.2', '4.2R2', '4.3', '4.4', '4.4R1', '4.5', 
 '4.6', '3.5R3', '3.5R4', '3.5R5', '3.6R2', '3.6R3', '3.6R4', 
 '3.6R5', '3.7R', '3.7R1', '3.7R2', '3.8R', '3.8R1', '3.8R2', 
 '3.8R3', '3.8R4', ' R1', ' R2', '4.1R2', '4.x', ' R2']

It does not find "empty" ones: names without a version, such as CTX or R LPTT GA, simply produce no entry in the list.
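
To keep one result per input line, including the lines without a version, the same pattern could be applied line by line. The following is a rough sketch rather than part of the original answer; `versions_per_line` is a hypothetical helper, and `strip()` is added to drop the leading space that the second alternative captures.

```python
import re

# Same pattern as above; re.MULTILINE is not needed when matching one line at a time.
PATTERN = re.compile(r'(\d[^ \n]*| [a-zA-Z]+?\d[^ \n]*).*$')


def versions_per_line(text):
    """Return one entry per line: the matched version, or None if the line has none."""
    result = []
    for line in text.splitlines():
        m = PATTERN.search(line)
        result.append(m.group(1).strip() if m else None)
    return result


sample = "CTX\nCTX 4.0R (Released)\nR LPTT GA\nR LPTT R1\nR Cianara 4.x (FRtRre)"
print(versions_per_line(sample))
# [None, '4.0R', None, 'R1', '4.x']
```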

Patrick Artner