0

I had a doubt with mysql. I have fields that I can easily restrict to some chars, like surname, and others like a mini log with an event particular that I think is no longer 2 than pages (or fewer characters too). I know it's a fairly common question, I see many links about that, but I can't see well what's the difference and the right way.

Here are the conclusions and doubts I have, please correct me if I'm wrong:

-varchar is an variable type, it doesn't matters if I set varchar(10) or varchar(65535), although not the same 65536, which passes to medium text. -varchar with one or other parameters is the same, only takes necessary space, it's only a restriction to save and a restriction to representation.
-if I do the restriction in the side of my application it's exactly the same and more flexible.
-text is of fixed length, if I set text it will be occupy 65535, both I use it or not.

varchar
-it has an overloaded of having to find first what's the length of field.
-it has a cost of one byte more where the length is saved.
-it is save inline, that means in the table.
-it has better performance, always.

text
-it is stored outside the table, there is a pointer to my text, is that true?
-it have a fixed number of characters.

-the lenght that is setted both in varchar and text is in characters, not in bytes (previously versions was different)

Conclution:

-For these reasons it will be better always use a varchar, there is only one case that's not true, wich is in case of I know exactly the numbers of characters.
All the other cases, which are the most commons, will be better to use a longest varchar, therefore it is to names or surnames (although I can restrict it to text(35), but where I don't know if the name is "Tom" or "Federico Guillermo Rodrigo Cayetano"), as for a field matter or desctription (more variable), or for a text more bigger like a mini log, who can vary from 0 to 3 pages of characters (that's in my case, but we can take it to an extreme like 0 to 16 pages of text, around 65535 characters). This is right especially if we can not know future requisites, like a longest description, if we restrict the table space, a change of requirements can be a hard problem.

Finally, like a bonus :), What happens with a text more than 65535 characters but who can vary?, it's reasonable to set it to medium text, or longtext if it can vary?

Rulolp
  • 21
  • 5
  • Sure, it is reasonable to do that. – Jay Blanchard Jan 11 '16 at 21:06
  • If varchar is always better at performance, why would using text be an advantage at any point, even knowing the exact length of the text? – Phiter Jan 11 '16 at 21:09
  • "I have fields that I can easily restrict to some chars, like surname" Well, that's probably going to be an assumption that comes back to bite you later. Don't forget that [most programmers have many false conceptions about names](http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/). – tadman Jan 11 '16 at 21:12
  • Why does this question have a `php` tag? – VolkerK Jan 11 '16 at 21:20
  • see also: http://dba.stackexchange.com/questions/424/performance-implications-of-mysql-varchar-sizes – VolkerK Jan 11 '16 at 21:21
  • PhiterFernandes hahaha, yes, I think it when I hear it has better performance over there (maybe for the inline storage?), but I need a reason for the mysql developers don't erase that. @tadman Yes, for that I put the example of the two names, but it's very rarely to find a fixed character space, even in passwords fields, if you need 40 characters, recommended is put more, like 60, or even better 255 for futures changes in encryption. Then, I don't find any place to put a fixed text, or a little varchar, that is what I understand so far. – Rulolp Jan 11 '16 at 21:28
  • The storage cost difference between `VARCHAR(255)` and `VARCHAR(20)` for a 10-character string is pretty much the same. The only time you need to aggressively trim your fields is when you can measure the impact in terms of gigabytes saved, something that's very rarely the case. – tadman Jan 11 '16 at 21:30

1 Answers1

0

I think your assumptions there aren't entirely true and might be overly paranoid or tainted by historical concerns that are no longer valid.

Generally it's best to use VARCHAR because MySQL, and other database engines, have been optimized to handle them quite well. Historically speaking (1990s) there were significant performance differences when using tables with fixed-length rows vs. variable length but that's of little concern today.

Likewise, TEXT columns used to have all their data stored external from the primary table data. Now I believe there's a hybrid approach used, with the first N characters stored in the row and the remainder stored in the BLOB system, if necessary. It is variable length by definition, a single character does not use 65535 bytes to store.

You'll need to use LONGTEXT for much larger blocks of text, but at that point you should evaluate if you're using your database correctly. Storing, retrieving and manipulating huge objects in there can be a serious hassle.

I'd use VARCHAR(255) for anything "text" by default and extend it longer if necessary. Shortening that is generally problematic except for very narrow cases, like where you're storing an ISO country code that will, by definition be two characters.

Presuming names are "no more than 32 characters" is almost always a mistaken assumption. Leave these wide open unless you've got a very good reason to do otherwise.

You might think that things like city or place names are short, too, but you'd be wrong. The world is a strange place. Expect the unexpected.

tadman
  • 208,517
  • 23
  • 234
  • 262
  • "I'd use VARCHAR(255) for anything "text" by default and extend it longer if necessary." That's kind of affirmation I was searching :) I don't understand why you say "The storage cost difference between VARCHAR(255) and VARCHAR(20) is pretty much the same", I thought it's the exactly same thing, 20, 255 or 10000. And for the mini log, about 0 to 10.000 characters maximum, I will put VARCHAR(10000) then. Thanks for the links, was very illustrative :) – Rulolp Jan 12 '16 at 14:31
  • There's [a list of storage requirements](http://dev.mysql.com/doc/refman/5.7/en/storage-requirements.html) in the MySQL documentation that's worth reading. If you're dealing with billions of records and every byte counts this is important. In general practice it's not as relevant. 10,000 characters seems like an arbitrary cut-off, too, which might surprise your users. For "long, freeform chunks of text", such as you might find in a person's bio or blog post, 10K is probably not enough. For a name it's probably way too much. For those cases where you just don't know, `LONGTEXT` is a safer bet. – tadman Jan 13 '16 at 17:12