Friday, November 3, 2017

How to Name Things

Use the plural for packages with homogeneous contents and the singular for packages with heterogeneous contents.

A class is similar to a database relation. A database relation should be named in the singular as its records are considered to be instances of the relation. The function of a relation is to compose a complex record from simple data.

A package, on the other hand, is not a data abstraction. It assists with organization of code and resolution of naming conflicts. If a package is named in the singular, it doesn't mean that each member of the package is an instance of the package; it contains related but heterogeneous concepts. If it is named in the plural (as they often are), I would expect that the package contains homogeneous concepts.

For example, a type should be named TaskCollection instead of TasksCollection, as it is a collection containing instances of a Task. A package named com.myproject.task does not mean that each contained class is an instance of a task. There might be a TaskHandler, a TaskFactory, etc. A package named com.myproject.tasks, however, would contain different types that are all tasks: TakeOutGarbageTask, DoTheDishesTask, etc.
  1. Don't reuse the same variable name in different contexts within one class (for example, in a method, a constructor, and a field). Distinct names keep the class simpler to understand and maintain.
  2. Don't use the same variable for different purposes within a method, conditional, etc. Create a new, differently named variable instead. This also matters for maintainability and readability.
  3. Don't use non-ASCII characters in variable names. They may work on your platform but not on others.
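As a rough illustration of the second rule, the sketch below gives each intermediate value its own name instead of recycling one variable. OrderMath and its parameters are hypothetical names chosen for this example.

```java
final class OrderMath {
    // Each intermediate value has its own name; no "temp" variable is
    // reassigned to mean first a subtotal and then a discount.
    static int totalInCents(int unitPriceInCents, int quantity, int discountPercent) {
        int subtotalInCents = unitPriceInCents * quantity;
        int discountInCents = subtotalInCents * discountPercent / 100;
        return subtotalInCents - discountInCents;
    }
}
```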
Abbreviations are to be avoided, except for acronyms and certain common abbreviations.

Input parameters are a special kind of local variable. They should be named much more carefully than ordinary local variables, as their names are an integral part of their method’s documentation.

Type parameter names usually consist of a single letter. Most commonly it is one of these five: T for an arbitrary type, E for the element type of a collection, K and V for the key and value types of a map, and X for an exception. The return type of a function is usually R. A sequence of arbitrary types can be T, U, V or T1, T2, T3.
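A small sketch of these conventions in Java follows. The class and method names (TypeParameterNames, mapAll, indexBy) are made up for illustration; only the single-letter type parameters are the point.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class TypeParameterNames {
    // T is an arbitrary input type, R is the return type of the function.
    static <T, R> List<R> mapAll(List<T> items, Function<? super T, ? extends R> mapper) {
        List<R> results = new ArrayList<>();
        for (T item : items) {
            results.add(mapper.apply(item));
        }
        return results;
    }

    // K and V are the key and value types of the map being built.
    static <K, V> Map<K, V> indexBy(List<V> values, Function<? super V, ? extends K> keyOf) {
        Map<K, V> index = new LinkedHashMap<>();
        for (V value : values) {
            index.put(keyOf.apply(value), value);
        }
        return index;
    }
}
```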
1. Omit words that are obvious given a variable’s or parameter’s type
2. Omit words that don’t disambiguate the name
3. Omit words that are known from the surrounding context
4. Omit words that don’t mean much of anything
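The four rules above might look like this in practice. HolidayCalendar is a hypothetical class; the longer names in the comments are the kind of names the rules argue against.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class HolidayCalendar {
    // "holidays" rather than "holidayDateList": the type already says
    // that this is a list of dates.
    private final List<LocalDate> holidays;

    HolidayCalendar(List<LocalDate> holidays) {
        this.holidays = new ArrayList<>(holidays);
    }

    // "contains(date)" rather than "containsHolidayDate(holidayDate)":
    // the surrounding class already supplies the word "holiday".
    boolean contains(LocalDate date) {
        return holidays.contains(date);
    }
}
```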
  • Use complete names where an object is referenced from more than one place. Abbreviate for local scope only (for example, Enumeration e = getEnumeration()). Bad idea: use pronounceable names for all variables. Is it really so tough to type "each" rather than "e"? Readability is more important.
    • [Does using full words really improve readability for local variables' names? Is function(first, second) {return first < second;} easier to read than function(a,b) {return a < b;}, for example? (Or are there better choices than either of those?) -DavidMcLean]
      • Each is readable, neither is descriptive as to what it is. What it does is. The use of complete names as indicated above was with regard to "an object .. referenced from more than one place". I take that to mean outside of local scope. Local variables are not addressable directly from outside. They may be via calling parameters however.
      • I prefer names over single characters when referring to artifacts that are understandable by name and can be acted upon by artifactories to produce other such artifacts. Single characters are more readable when used in expressions like a=(x*b)+c
      • Example: DinnerTimeFood = function Recipe(ingredients, instructions) is more humanly readable than a = function b(c,d) -- DonaldNoyes
Instead of describing the operation that triggered the exception, it would probably be better to describe the reason: for example DuplicateUserException("can't add user because it already exists")

What I would do is use a single UserManagementException instead of many exception names, and specify the exact cause in a message or error code defined in the class.
I have found more exceptions with the form <problem><noun>Exception. For example, IllegalArgumentException. So that's what I've decided to use in the future.
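A minimal sketch of an exception named for the reason rather than the operation, following the <problem><noun>Exception form. UserRegistry is a hypothetical class used only to give the exception something to be thrown from.

```java
import java.util.HashSet;
import java.util.Set;

// Named for the reason the call failed, not for the operation that failed.
class DuplicateUserException extends RuntimeException {
    DuplicateUserException(String message) {
        super(message);
    }
}

final class UserRegistry {
    private final Set<String> userNames = new HashSet<>();

    void add(String userName) {
        // Set.add returns false when the element was already present.
        if (!userNames.add(userName)) {
            throw new DuplicateUserException(
                "can't add user '" + userName + "' because it already exists");
        }
    }
}
```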
CanRetrieve sounds fine to me. I've seen the Can stem used in Microsoft APIs. The only other real option IMO is IsRetrievable (from Aziz) which somehow seems too linguistically twisted!
In .NET, you often have pairs of methods where one of them might throw an exception (DoStuff), and the other returns a Boolean status and, on successful execution, the actual result via an out parameter (TryDoStuff).
(Microsoft calls this the "Try-Parse Pattern", since perhaps the most prominent example for it are the TryParse methods of various primitive types.)
If the Try prefix is uncommon in your language, then you probably shouldn't use it.

What if you simply threw the exception to the calling code?
This way you are delegating the exception handling to whoever is using your code. What if, in the future, you would like to do the following:
  • If no exceptions are thrown, take action A
  • If (for instance) a FileNotFoundException is thrown, take action B
  • If any other exception is thrown, take action C
If you throw back your exception, the above change would simply entail the addition of an extra catch block. If you leave it as is, you would need to change the method and the place from which the method is called, which, depending on the complexity of the project, can be in multiple locations.
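The scenario above can be sketched as follows. ConfigLoader, Caller, and the labels "A", "B", and "C" are hypothetical stand-ins for the three actions. In this sketch "any other exception" is narrowed to IOException, since that is all the method declares.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

final class ConfigLoader {
    // Throws to the caller instead of swallowing the exception here.
    static String readFirstLine(String path) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line = reader.readLine();
            return line == null ? "" : line;
        }
    }
}

final class Caller {
    static String chooseAction(String path) {
        try {
            ConfigLoader.readFirstLine(path);
            return "A"; // no exception thrown
        } catch (FileNotFoundException e) {
            return "B"; // the extra catch block added later
        } catch (IOException e) {
            return "C"; // any other I/O failure
        }
    }
}
```

Because FileNotFoundException is a subclass of IOException, its catch block has to come first.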
✓ DO use the prefix "Try" and Boolean return type for methods implementing this pattern.
✓ DO provide an exception-throwing member for each member using the Try-Parse Pattern.
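Java has no out parameters, so the pattern cannot be copied literally. The sketch below is one possible adaptation, not an established Java convention: it pairs an exception-throwing member with a Try-style member that returns an OptionalInt instead of a Boolean. PortNumbers, parsePort, and tryParsePort are hypothetical names.

```java
import java.util.OptionalInt;

final class PortNumbers {
    // Exception-throwing member: fails loudly on bad input.
    static int parsePort(String text) {
        int port = Integer.parseInt(text.trim()); // may throw NumberFormatException
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return port;
    }

    // Try-style counterpart: reports failure through the return value.
    static OptionalInt tryParsePort(String text) {
        try {
            return OptionalInt.of(parsePort(text));
        } catch (IllegalArgumentException e) {
            // NumberFormatException is a subclass of IllegalArgumentException.
            return OptionalInt.empty();
        }
    }
}
```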
This time I will start with a code sample. Take a look at this:
if (code.isComplexOrUnreadable()) { /* ... */ }

Let me ask you another question then. What do you think the implementation of the isComplexOrUnreadable() method looks like?
I assume that many of you would imagine something similar to this:
boolean isComplexOrUnreadable() {
    return complex || unreadable;
}
The first problem is duplication of information, which should now be obvious. The second is a lack of information.

Don’t Repeat Yourself

We repeat information about the way the method was implemented. This is definitely something we should avoid.

Do you remember the DRY principle? We won’t repeat the code, because if we do so, we will have to remember to update two or more places in case of any change.

What would happen if, after some time, we decided to change the implementation of the method body?
boolean isComplexOrUnreadable() {
    return (complex || unreadable) && tests.exist();
}

Would it be ok to leave the same name of the method? No, because the name would become misleading. The name of the method would give you invalid information. 
Any change in implementation would always affect the name of the method. So in our case we would have to rename the method into something like this:
boolean isComplexOrUnreadableWithTests() {
    return (complex || unreadable) && tests.exist();
}

But wait! It is still a little confusing, because after reading the name of the method we may get the false impression that the implementation looks like this:
return complex || unreadable && tests.exist();

Well, as you can see even kind of a “correct” name can be misleading in the situation when the name reflects implementation.
As you can see now, a name that describes HOW and not WHY can be a huge problem and can lead to other problems that degrade the quality of your code. This is not what we, as professionals, would like to do.

To summarize, naming a method or class so that it expresses the intention of the code is the right direction, because:
  • You know what place should be modified if a change is needed.
  • You are not duplicating the same information (how the method was implemented).
  • You are not losing the information about why the method or class exists.
  • You know what question is answered by the method’s invocation.
  • It is extremely easy to reuse the code when the same question must be asked in a different place.
  • You are not duplicating the code, because you simply don’t answer the same question twice.

And finally, this is how the code could look if the name expressed the intention:
if (code.canBeImproved()) { /* ... */ }
The name is wrong because of its direct relation to the implementation. However, this is not the only problem. The use of conjunctions in the method’s name is a sign that we could not find the right name and just listed all the known things that we’ve done. It does not matter whether this list is implementation- or logic-related.

  • Conjunctions - we talk so much about the Single Responsibility Principle, and it is important to apply it when we write code. Aren’t conjunctions a sign that SRP is not being followed? When we use words like “and” or “or”, we usually talk about more than one thing.
    Whenever you spot a conjunction in the name of your variable, method or class, you should treat it as a warning. There is a strong chance that improvement is needed.
  • Body change leads to name change - if the change in the code does not change the whole rationale behind functionality and yet still requires changing the name of the method/class, that’s a sign that probably the name does not express the true intention.
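A sketch of the "body change should not force a name change" test in Java. CodeUnderReview and its fields are hypothetical; the rule inside canBeImproved() is the one from the earlier snippets, and it could be replaced by a different rule without the name becoming misleading.

```java
final class CodeUnderReview {
    private final boolean complex;
    private final boolean unreadable;
    private final boolean hasTests;

    CodeUnderReview(boolean complex, boolean unreadable, boolean hasTests) {
        this.complex = complex;
        this.unreadable = unreadable;
        this.hasTests = hasTests;
    }

    // The name states the question being asked (WHY). The rule that answers
    // it (HOW) lives only in the body and can change without a rename.
    boolean canBeImproved() {
        return (complex || unreadable) && hasTests;
    }
}
```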
There is more to a program than a compiler understanding what you’ve written and running the code. Writing the code is one thing; maintaining it is another. Sometimes, maintaining existing code is more difficult than writing new code. We are afraid we may bring the project down.

  1. Make sure you use names that show your intentions. Names should explain what they do, why they do what they do, and how they do it.
  2. Endeavor to use names that carry a single meaning. Do not use names that might be confused with the meaning of something else.
  3. When there are differences between names, please make sure the distinction is clear.
  4. Ensure that names you use are easy to pronounce. Beyond communicating with the compiler, you want to be able to communicate with other developers about your project. So, use names that are easy to pronounce and reference.
  5. Use domain-specific names properly. Use names that map easily to the domain in which you are working and that reflect how naming works in that domain.
  6. Do not repeat namings. Avoid giving every variable the same prefix, so that every name in your program starts the same way. This makes searching your code harder.
  7. Make fuller use of the language you write in. Build your English vocabulary; the more words you know, the easier it is to communicate with other programmers.
Example #1: Dude, That Method Name Is Way Too Long

public static Object checkWidgetForAValidCustomerAndAccountAndContactRecord(Integer customerNumber) throws Exception
Rule of thumb: Keep your method names descriptive, but shorter rather than longer
An important rule is missing here: variables should be named with their scope in mind. If a variable is longer-lived and has a larger scope, its name should be that much more descriptive, because when you're looking at it, the only thing that ties the value of the variable to the context in which it can be used is its name.
So 'i' is fine for a loop control variable with a scope of five lines but totally inadequate for something expressing a larger and longer-lived concept.
Ditto for function names and parameters to functions: if the function and the parameters are named properly, understanding the function is trivial.
So if you write a chunk of code that exports one or more functions, that is where your effort should go, because that's the public interface. The reduced scope of the rest of the code should make any naming issues there much more limited.

An expressive name for a software object must be clear, precise, and small. 
Use Intention-revealing Names
We often see comments used where better naming would be appropriate:
 int d; // elapsed time in days
Such a comment is an excuse for not using a better variable name. The name 'd' doesn't evoke a sense
of time, nor does the comment describe what time interval it represents. It requires a change:
 int elapsedTimeInDays;
 int daysSinceCreation;
 int daysSinceModifica
The problem isn't the simplicity of the code but the implicity of the code: the degree to which the
context is not explicit in the code itself

Avoid Disinformation
A software author must avoid leaving false clues which obscure the meaning of code.
Do not refer to a grouping of accounts as an AccountList unless it's actually a list. The word
list means something specific to CS people. If the container holding the accounts is not actually a
list, it may lead to false conclusions. AccountGroup or BunchOfAccounts would have been better.
Beware of using names which vary in small ways. How long does it take to spot the subtle difference
between an XYZControllerForEfficientHandlingOfStrings in one module and,
somewhere a little more distant, an XYZControllerForEfficientStorageOfStrings? The
words have frightfully similar shapes.

It is nice if names for very similar things sort together alphabetically, and if
the differences are very, very obvious since the developer is likely to pick an object by name without
seeing your copious comments or even the list of methods supplied by that class.

Make Meaningful Distinctions
Noise words are another meaningless distinction. Imagine that you have a Product class. If you have
another called ProductInfo or ProductData, you have made the names different without making
them mean anything different. Info and Data are indistinct noise words like "a", "an" and "the".
Noise words are redundant. The word variable should never appear in a variable name. The word
table should never appear in a table name. How is NameString better than Name? Would a Name
ever be a floating point number? If so, it breaks an earlier rule about disinformation. Imagine finding
one class named Customer and another named CustomerObject, what should you understand as the
distinction? Which one will represent the best path to a customer's payment history?

Disambiguate in such a way that the reader knows what the different versions offer her, instead of
merely that they're different.

Use Pronounceable Names
A company I know has genymdhms (generation date, year, month, day, hour, minute and second) so
they walked around saying "gen why emm dee aich emm ess". I have an annoying habit of pronouncing
everything as-written, so I started saying "gen-yah-mudda-hims". It later was being called this by a host
of designers and analysts, and we still sounded silly. But we were in on the joke, so it was fun. Fun or
not, we were tolerating poor naming. New developers had to have the variables explained to them, and
then they spoke about it in silly made-up words instead of using proper English terms.
import java.util.Date;

// Before: names nobody can pronounce.
class DtaRcrd102 {
 private Date genymdhms;
 private Date modymdhms;
 private final String pszqint = "102";
 /* ... */
}

// After: the same record with pronounceable names.
class Customer {
 private Date generationTimestamp;
 private Date modificationTimestamp;
 private final String recordId = "102";
 /* ... */
}

Use Searchable Names
Single-letter names and numeric constants have a particular problem in that they are not easy to locate
across a body of text.

My personal preference is that single-letter names can ONLY be used as local variables inside short
methods. The length of a name should somehow correspond to the size of its scope. If a variable or
constant might be seen or used in multiple places in a body of code it is imperative to give it a search-
friendly name.
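A short sketch of both halves of that preference. Timesheet and its members are hypothetical: the constant is named so that a text search finds every use, while the single-letter index stays confined to a three-line loop.

```java
final class Timesheet {
    // A named constant can be located with a text search; a bare 5 cannot.
    static final int WORK_DAYS_PER_WEEK = 5;

    static int totalWorkDays(int weeks) {
        return weeks * WORK_DAYS_PER_WEEK;
    }

    static int sumHours(int[] hoursPerDay) {
        int totalHours = 0;
        // A single-letter index is acceptable here: its scope is three lines.
        for (int i = 0; i < hoursPerDay.length; i++) {
            totalHours += hoursPerDay[i];
        }
        return totalHours;
    }
}
```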

Avoid Encodings
Avoid Mental Mapping
Smart is overrated. Clarity is king. The very smart must use their talent to write code that others are
less likely to misunderstand.
Use Noun and Verb Phrases
Classes and objects should have noun or noun phrase names.

Other methods (sometimes called "mutators", though not so commonly anymore) cause something to
happen. These represent a small "transaction" on the object (and generally should be a complete
action). Mutators should have verb or verb-phrase names. This way, changing a name would read:
You will notice that the above line reads more like a sentence than a lot of code. It leaves a dangling
"to", which is completed by the parameter. The intention is to make the parameter list more obvious so
that it is harder to make foolish errors.
Another trend is to use a named creation function, rather than yet another overloaded constructor. It can
be more obvious to read the creation of a complex number using
than using the constructor version
 new Complex(23.0);

As a class designer, does this sound boringly unimportant? If so, then go write code that uses your
classes. The best way to test an interface is to use it and look for ugly, contrived, or confusing text. The
most popular way to do this in the 21st century is to write Unit Tests for the module. If you have
trouble reading the tests (or your partners do) then rework is in order.
Over time we've found that this rule extends even to constructors. Rather than having a lot of
overloaded constructors and having to choose among them by their parameter lists, we frequently create
named creation functions as class (static) methods.
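A minimal sketch of named creation functions as static methods. The names fromRealNumber and fromPolar are assumptions chosen for this illustration, not taken from the text. The constructor is private, so callers must say which kind of value they are passing.

```java
final class Complex {
    private final double real;
    private final double imaginary;

    // Private: callers go through the named creation functions below.
    private Complex(double real, double imaginary) {
        this.real = real;
        this.imaginary = imaginary;
    }

    // The name says what the single argument means.
    static Complex fromRealNumber(double real) {
        return new Complex(real, 0.0);
    }

    // Same parameter types as a (real, imaginary) constructor would have,
    // but the name removes the ambiguity.
    static Complex fromPolar(double magnitude, double angleInRadians) {
        return new Complex(magnitude * Math.cos(angleInRadians),
                           magnitude * Math.sin(angleInRadians));
    }

    double real() { return real; }
    double imaginary() { return imaginary; }
}
```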
Don't Be Cute

Pick One Word Per Concept
Pick one word for one abstract function and stick with it. 

A consistent lexicon is a great boon to programmers who must use your classes, even if it may seem
like a pain while developing the classes. If you have to, you can write it into a wiki page or a document,
but then it must be maintained. Don't create documents lightly.

You want your readers to afford some lazy reading and
assumptions. You want your code to be a quick skim, not an intense study. You want to use the popular
paperback model whereby the author is responsible for making himself clear and not the academic
model where it is the scholar's job to dig the meaning out of the paper.

Use Solution Domain Names
Solution domain names are appropriate only if you are working at a low-level where the solution
domain terms completely describe the work you are doing. For work at a higher level of abstraction,
you should Use Problem Domain Names

Make Context Meaningful
Add Meaningful Context
If you have a number of variables with the same prefix (address_firstName, address_lastname,
address_Street), it can be a pretty clear clue that you need to create a class for them to live in
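One way to act on that clue, sketched in Java: the shared prefix becomes a class. The Address fields mirror the three variables above; mailingLabel() is a hypothetical method added only to show the fields being used together.

```java
final class Address {
    // The prefix "address_" is gone: the class supplies that context.
    private final String firstName;
    private final String lastName;
    private final String street;

    Address(String firstName, String lastName, String street) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.street = street;
    }

    String firstName() { return firstName; }
    String lastName() { return lastName; }
    String street() { return street; }

    String mailingLabel() {
        return firstName + " " + lastName + ", " + street;
    }
}
```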
Don't add Gratuitous Context

Shorter names are generally better than longer ones, if they are clear. Add no more context to a name
than is necessary.
The names accountAddress and customerAddress are fine names for instances of the class
Address but could be poor names for classes. Address is a fine name for a class. If I need to
differentiate between MAC addresses, port addresses, and web addresses, I might consider
PostalAddress, MAC, and URI. The resulting names are more precise, and isn't precision the point of
all naming?

Meaningful Names

This factory will be an interface and will be implemented by a concrete class. What should you name them? IShapeFactory and ShapeFactory? I prefer to leave interfaces unadorned. The preceding I, so common in today’s legacy wads, is a distraction at best and too much information at worst. I don’t want my users knowing that I’m handing them an interface. I just want them to know that it’s a ShapeFactory. So if I must encode either the interface or the implementation, I choose the implementation. Calling it ShapeFactoryImp, or even the hideous CShapeFactory, is preferable to encoding the interface.

Classes and objects should have noun or noun phrase names like Customer, WikiPage, Account, and AddressParser. Avoid words like Manager, Processor, Data, or Info in the name of a class. A class name should not be a verb.

Methods should have verb or verb phrase names like postPayment, deletePage, or save. Accessors, mutators, and predicates should be named for their value and prefixed with get, set, and is according to the JavaBean standard.
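A compact illustration of these prefixes. WikiPage is borrowed from the class-name examples above; its fields and the publish() method are assumptions made for the sketch.

```java
final class WikiPage {
    private String title;
    private boolean published;

    WikiPage(String title) {
        this.title = title;
    }

    // Accessor: get + the name of the value.
    String getTitle() { return title; }

    // Mutator: set + the name of the value.
    void setTitle(String title) { this.title = title; }

    // Predicate: is + the name of the value.
    boolean isPublished() { return published; }

    // Any other method: a plain verb.
    void publish() { published = true; }
}
```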

When constructors are overloaded, use static factory methods with names that describe the arguments


The power of variable names

A good mnemonic name generally speaks to the problem rather than the solution. A good name tends to express the what more than the how. In general, if a name refers to some aspect of computing rather than to the problem, it's a how rather than a what. Avoid such a name in favor of a name that refers to the problem itself.

A record of employee data could be called inputRec or employeeData. inputRec is a computer term that refers to computing ideas—input and record. employeeData refers to the problem domain rather than the computing universe. Similarly, for a bit field indicating printer status, bitFlag is a more computerish name than printerReady. In an accounting application, calcVal is more computerish than sum.

Names that are too long are hard to type and can obscure the visual structure of a program.
numTeamMembers, teamMemberCount
numSeatsInStadium, seatCount
teamPointsMax, pointsRecord

Programs with names averaging 8 to 20 characters were almost as easy to debug.


A programmer reading such a variable should be able to assume that its value isn't used outside a few lines of code.
Longer names are better for rarely used variables or global variables, and shorter names are better for local variables or loop variables.

If you modify a name with a qualifier like Total, Sum, Average, Max, Min, Record, String, or Pointer, put the modifier at the end of the name.
An exception to the rule that computed values go at the end of the name is the customary position of the Num qualifier. Placed at the beginning of a variable name, Num refers to a total: numCustomers is the total number of customers. Placed at the end of the variable name, Num refers to an index: customerNum is the number of the current customer. The s at the end of numCustomers is another tip-off about the difference in meaning. But, because using Num so often creates confusion, it's probably best to sidestep the whole issue by using Count or Total to refer to a total number of customers and Index to refer to a specific customer. Thus, customerCount is the total number of customers and customerIndex refers to a specific customer.
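The Count and Index convention might read like this in a loop. CustomerReport and its output format are invented for the example.

```java
final class CustomerReport {
    static String describe(String[] customers) {
        // Count: the total number of customers.
        int customerCount = customers.length;
        StringBuilder report = new StringBuilder();
        // Index: the position of the current customer.
        for (int customerIndex = 0; customerIndex < customerCount; customerIndex++) {
            report.append(customerIndex + 1).append('/').append(customerCount)
                  .append(' ').append(customers[customerIndex]).append('\n');
        }
        return report.toString();
    }
}
```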

If a variable is to be used outside the loop, it should be given a name more meaningful than i, j, or k.
If you have several nested loops, assign longer names to the loop variables to improve readability.
Many experienced programmers avoid names like i altogether.

Use positive boolean variable names. Negative names like notFound, notdone, and notSuccessful are difficult to read when they are negated, for example:
if not notFound
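With a positive name the same test reads naturally. StockCheck and its return strings are hypothetical, used only to show the negation.

```java
import java.util.Arrays;

final class StockCheck {
    static String status(String[] itemsInStock, String wantedItem) {
        // Positive name: "found" rather than "notFound".
        boolean found = Arrays.asList(itemsInStock).contains(wantedItem);
        if (!found) { // reads as "if not found", with no double negative
            return "missing";
        }
        return "in stock";
    }
}
```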
When naming constants, name the abstract entity the constant represents rather than the number the constant refers to.

Creating Short Names That Are Readable
Kinds of Names to Avoid
Avoid names with similar meanings.
input and inputValue, recordNum and numRecords, and fileNumber and fileIndex are so semantically similar that if you use them in the same piece of code you'll easily confuse them and install some subtle, hard-to-find errors.

Avoid names that sound similar, such as wrap and rap

Avoid variables with different meanings but similar names.
Have at least two-letter differences between names, or put the differences at the beginning or at the end. clientRecords and clientReports are better than the original names.

Avoid numerals in names
Avoid misspelled words in names.
In general, it is better to be too descriptive than too terse, but always consider the scope that the variable will exist in. Short names are preferable in smaller scopes, while longer names are more appropriate for longer-lived objects.
Larger-scoped variables require longer and more descriptive names:
private CommandProcessor sequentialCommandProcessor =
    new CommandProcessor();
All "clients" look similar: They encapsulate the destination URL with some access credentials and expose a number of methods, which transport the data to/from the "server." Even though this design looks like a proper object, it doesn't really follow the true spirit of object-orientation. That's why it's not as maintainable as it should be, for the following reasons:
  • Its scope is too broad. Since the client is an abstraction of a server, it inevitably has to represent the server's entire functionality. When the functionality is rather limited there is no issue. Take HttpClient from Apache HttpComponents as an example. However, when the server is more complex, the size of the client also grows. There are over 160 (!) methods in AmazonS3Client at the time of writing, while it started with only a few dozen just a few years (a few hundred versions) ago.
  • It is data focused. The very idea of a client-server relationship is about transferring data. Take the HTTP RESTful API of the AWS S3 service as an example. There are entities on the AWS side: buckets, objects, versions, access control policies, etc., and the server turns them into JSON/XML data. Then the data comes to us and the client on our side deals with JSON or XML. It inevitably remains data for us and never really becomes buckets, objects, or versions.
  • Extendability issues. Needless to say, it's almost impossible to decorate a client object when it has 160+ methods and keeps on growing. The only possible way to add new functionality to it is by creating new methods. Eventually, we get a monster class that can't be reused anyhow without modification.
What is the alternative?
The right design would be to replace "clients" with client-side objects that represent entities of the server side, not the entire server. For example, with the S3 SDK, that could be Bucket, Object, Version, Policy, etc. Each of them exposes the functionality of real buckets, objects, and versions, which the AWS S3 can expose.
Of course, we will need a high-level object that somehow represents the entire API/server, but it should be small. For example, in the S3 SDK example it could be called Region, which means the entire AWS region with buckets. Then we could retrieve a bucket from it and won't need a region anymore. Then, to list objects in the bucket we ask the bucket to do it for us. No need to communicate with the entire "server object" every time, even though technically such a communication happens, of course.
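The shape of that design can be sketched with two small interfaces. Everything here is hypothetical and is not the AWS SDK or any jcabi API: Region and Bucket are the entity names from the text, and the in-memory classes exist only so the sketch runs without a network.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// An entity of the server side, not the server itself.
interface Bucket {
    void put(String key, String content);
    List<String> keys();
}

// The small high-level object: all it does is hand out buckets.
interface Region {
    Bucket bucket(String name);
}

// In-memory stand-ins; a real implementation would talk to the service.
final class InMemoryBucket implements Bucket {
    private final Map<String, String> objects = new TreeMap<>();

    @Override
    public void put(String key, String content) {
        objects.put(key, content);
    }

    @Override
    public List<String> keys() {
        return new ArrayList<>(objects.keySet()); // sorted, since TreeMap orders keys
    }
}

final class InMemoryRegion implements Region {
    private final Map<String, Bucket> buckets = new HashMap<>();

    @Override
    public Bucket bucket(String name) {
        return buckets.computeIfAbsent(name, ignored -> new InMemoryBucket());
    }
}
```

Once a Bucket has been retrieved, the caller asks the bucket itself to list its objects and no longer needs the Region.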
To summarize, the trouble is not exactly in the name suffix, but in the very idea of representing the entire server on the client side rather than its entities. Such an abstraction is 1) too big and 2) very data driven.
By the way, check out some of the JCabi libraries (Java) for examples of object-oriented clients without "client" objects: jcabi-github, jcabi-dynamo, jcabi-s3, or jcabi-simpledb.
    Github github = new RtGithub(".. your OAuth token ..");
    Repo repo = github.repos().get(new Coordinates.Simple("jcabi", "jcabi-github"));
    Issue issue = repo.issues().create("How are you?", "Please tell me...");
    issue.comments().post("My first comment!");
Think of new developers hiring on or transferring into the group. They’re going to take a look at the code and draw conclusions about your team. Software developers tend to have exacting, detail-oriented minds, and they tend to notice mistakes. Having a bunch of spelling mistakes in common words makes it appear either that the team doesn’t know how to spell or that it has a sloppy approach. Neither of those is great.
But also keep in mind that what happens in the code doesn’t always stay in the code. Bits of the code you write might appear on team dashboards, build reports, unit test run outputs, etc. People from outside of the team may be examining acceptance tests and the like. And you may have end-user documentation generated automatically from your code (e.g. if you make developer tools or APIs). Do you really want the documentation you hand to your customers to contain embarrassing mistakes?

