0

All my column headers in a MySQL database are prefixed with a number, 1_X, 2_X, etc... which makes bringing the data into IDL impossible using just a basic select statement to bring in the entire table. I'm not sure but I see two possible ways:

1) Bring in the table with column name aliases. Can I use TRIM or SUBSTRING_INDEX to remove/replace the first two characters?

2) Create a routine that uses the information schema to to recursively go through and delete the first two characters of the column headers and create a new table with those headers and copy the data in.

If there weren't so many different tables (all with 1_X, 2_X, etc...) there'd be no problem manually selecting 1_X AS X but that's not feasible. It would be great to be able to use TRIM/SUBSTRING on column headers in the select statement.

Thanks.

3 Answers3

0

I think you can follow option 2. However this will not be quick solution.

Another way around this could be,

  1. Generate schema script for the tables you want to correct.
  2. Open the script in notepad++ or any editor that supports find using regular expression.
  3. Search and replace with [0-9]+_ expression and empty string for replacement.
  4. Create the new tables using this script and copy data into them.

This may sound like a manual approach but you will do this once for all of your tables.

Riz
  • 1,119
  • 15
  • 23
  • There's a potential for creating duplicate (or even zero length) column names with stripping off leading digit characters. – spencer7593 Mar 27 '14 at 22:46
  • That would be great, straight forward and easy but I need it to work in the sql command as new tables are being created with leading numbers regularly (not often but enough). I need it to be as automated as possible if I'm not around to help in the future. – user3306708 Mar 27 '14 at 23:13
0

Look into a strategy of doing 2 selects, one for the column name, then one for the data with column alias. You might have to revert to some scripting language, like PHP, for help.

First, get the column names :

show columns from tbl_client;
+-------------------------------+-----------------------------------+------+-----+---------------------+-----------------------------+
| Field                         | Type                              | Null | Key | Default             | Extra                       |
+-------------------------------+-----------------------------------+------+-----+---------------------+-----------------------------+
| 1_X                           | int(11)                           | NO   | PRI | NULL                | auto_increment              |

Then, loop through the results and create a list of column alias

Then create your new select

SELECT 1_X as NEW_COLUMN_NAME_FOR_FIELD_1 FROM tbl_client;
crafter
  • 6,246
  • 1
  • 34
  • 46
0

It's not possible to use functions in a SQL statement to alter the identifier assigned to a column being returned. The SQL way of specifying the identifier for the column in a resultset is to use the expr AS alias approach.


Rather than trim off the leading digit characters, you could prepend the identifiers with another valid character. (Trimming off leading characters seems like it would potentially lead to another problem, duplicate and/or zero length column names.)

You could just use a SQL statement to generate the SELECT list for you.

(NOTE: the GROUP_CONCAT function is limited by some system/session variables: group_concat_max_len and max_allowed_packet, it's easy enough to adjust these higher, though changing global max_allowed_packet may require MySQL to be restarted.)

To get it back the SELECT list on all one line (assuming you won't overflow the GROUP_CONCAT limits) something like:

SELECT c.table_schema
     , c.table_name
     , GROUP_CONCAT(
          CONCAT('t.`',c.column_name,'` AS `x',c.column_name,'`')
          ORDER BY c.ordinal_position
       ) AS select_list_expr
  FROM information_schema.columns c
  FROM information_schema.columns c
 WHERE c.table_schema = 'mydatabase'
 GROUP BY c.table_schema, c.table_name

Or, you could even get back a whole SELECT statement, if you wrapped that GROUP_CONCAT expression (which produces the select list) in another CONCAT

Something like this:

SELECT CONCAT('SELECT '
          , GROUP_CONCAT(
               <select_list_expr>
            )
          , ' FROM `',c.table_schema,'`.`',c.table_name,'` t;'
       ) AS stmt
  FROM information_schema.columns c
 WHERE c.table_schema = 'mydatabase'
 GROUP BY c.table_schema, c.table_name

You could use a more clever expression for <select_list_expr>, to check for leading "digit" characters, and assign an alias to just those columns that need it, and leave the other columns unchanged, though that again introduces the potential for returning duplicate column names.

That is, if you already have columns named '1_X' and 'x1_X' in the same table. But a carefully chosen leading character may avoid that problem...

The <select_list_expr> could be more clever by doing a conditional test for leading digit character, something like this:

SELECT CONCAT('SELECT '
          , GROUP_CONCAT(
               CASE
               WHEN c.column_name REGEXP '^[[:digit:]]'
               THEN CONCAT('t.`',c.column_name,'` AS `x',c.column_name,'`')
               ELSE CONCAT('t.`',c.column_name,'`')
               END
           )
          , ' FROM `',c.table_schema,'`.`',c.table_name,'` t;'
       ) AS stmt
  FROM information_schema.columns c
 WHERE c.table_schema = 'mydatabase'
 GROUP BY c.table_schema, c.table_name

Again, there's a potential for generation "duplicate" column names with this approach. The conditional test "c.column_name REGEXP" could be extended to check for other "invalid" leading characters as well.


As a side note, at some point, someone thought it a "good idea" to name columns with leading digit characters. Just because something is allowed doesn't mean it's a good idea.


Then again, maybe all that rigamarole isn't necessary, and just wrapping the column names in backticks would be sufficient for your application.

spencer7593
  • 106,611
  • 15
  • 112
  • 140
  • prepending would work just fine as well, this looks promising and I can't wait to get to work on this. the program that i use to feed CSV data into the mysql database has been using this format for years and i now have many thousands of columns with leading digits with no end in sight. i'll double check on bringing in backticks. – user3306708 Mar 27 '14 at 23:09
  • I'm trying to figure this out on my own but I'm having trouble as I'm also trying to learn more about CONCAT. This works almost perfect but I'm unable to run this query on an individual table. – user3306708 Mar 28 '14 at 17:14
  • You could add just "` AND c.table_name = 'mytablename'`". I'm not sure I was clear, this query doesn't run against any particular table, it just produces a SQL statement that will extract rows from a table. The SELECT statement it produces could be used as part of a CREATE VIEW statement, but I didn't (and don't) recommend that because you don't want to introduce performance killing views into your MySQL schema. – spencer7593 Mar 28 '14 at 17:58
  • I am assuming that there's some reason you can't change the names of the existing columns using ALTER TABLE statements (i.e. there's a bunch of SQL already running that uses the existing column names, and that would break), and some reason your "IDL" disallows wrapping column names in backticks, or disallows ANSI quote mode and wrapping the column names in double quotes. (The query in my answer just demonstrates how to extract the names of columns for a table, and use string functions to manipulate that into something you can use. – spencer7593 Mar 28 '14 at 18:04