Saturday, February 22, 2020

Google Doc Tips



https://gsuitetips.com/tips/docs/see-all-comments-in-a-google-doc-even-resolved-ones/
Rather than scrolling through your document looking for a comment, click the “Comments” button at the top of the document, which opens a summary list of every comment; you can also click the Notifications settings link in the top right to configure your notifications.
This is really useful if a discussion has been “resolved” but you want to check the comment stream, or even re-open a debate.
This tip also works in Google Sheets and Slides.


Wednesday, February 5, 2020

Java Stream Advanced Usage



https://stackoverflow.com/questions/36255007/is-there-any-way-to-reuse-a-stream
A stream should be operated on (invoking an intermediate or terminal stream operation) only once.
A stream implementation may throw IllegalStateException if it detects that the stream is being reused.

https://stackoverflow.com/questions/38963338/stream-way-to-get-index-of-first-element-matching-boolean
Streams and indexing don't mix well. You're usually better off falling back to an old-style loop at that point:

// Assumes users is a List<String>; returns the index of the first name equal to searchName.
OptionalInt indexOpt = IntStream.range(0, users.size())
     .filter(i -> searchName.equals(users.get(i)))
     .findFirst();


https://rules.sonarsource.com/java/RSPEC-3864
  • As long as a stream implementation can reach the final step, it can freely optimize processing by only producing some elements or even none at all (e.g. relying on other collection methods for counting elements). Accordingly, the peek() action will be invoked for fewer elements or not at all.
http://marxsoftware.blogspot.com/2018/06/peeking-inside-java-streams.html

https://www.baeldung.com/java-streams-peek-api
The reason peek() didn't work in our first example is that it's an intermediate operation and we didn't apply a terminal operation to the pipeline. 

peek()'s Javadoc page says: “This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline”.

On top of that, peek() can be useful in another scenario: when we want to alter the inner state of an element. For example, let's say we want to convert all users' names to lowercase before printing them:
Stream<User> userStream = Stream.of(new User("Alice"), new User("Bob"), new User("Chuck"));
userStream.peek(u -> u.setName(u.getName().toLowerCase()))
  .forEach(System.out::println);
Alternatively, we could have used map(), but peek() is more convenient since we don't want to replace the element.
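For comparison, a rough sketch of that map() alternative (building a fresh stream, since the one above has already been consumed; User is assumed to be the simple POJO with getName()/setName() from the article):

     Stream.of(new User("Alice"), new User("Bob"), new User("Chuck"))
         .map(u -> { u.setName(u.getName().toLowerCase()); return u; })
         .forEach(System.out::println);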

https://mkyong.com/java8/java-8-stream-the-peek-is-not-working-with-count/
Refer to the Java 9 Stream.count() Javadoc:
An implementation may choose to not execute the stream pipeline (either sequentially or in parallel) 
if it is capable of computing the count directly from the stream source. 
In such cases no source elements will be traversed and no intermediate operations will be evaluated. 

Since Java 9, if the stream implementation can compute the count directly from the stream source (an optimization introduced in Java 9), it doesn't traverse the stream at all, so peek() is never run.
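A minimal sketch of that behavior (not from the article): on Java 9+ the sized source lets count() be computed directly, so the peek() action never runs; on Java 8 the elements are traversed and printed.

     long n = Stream.of("a", "b", "c")
         .peek(s -> System.out.println("peek: " + s))   // prints nothing on Java 9+
         .count();
     System.out.println(n);                             // 3 either way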

https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/stream/Stream.html#peek(java.util.function.Consumer)
This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline:

     Stream.of("one", "two", "three", "four")
         .filter(e -> e.length() > 3)
         .peek(e -> System.out.println("Filtered value: " + e))
         .map(String::toUpperCase)
         .peek(e -> System.out.println("Mapped value: " + e))
         .collect(Collectors.toList());
 
In cases where the stream implementation is able to optimize away the production of some or all the elements (such as with short-circuiting operations like findFirst, or in the example described in count()), the action will not be invoked for those elements.
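A small sketch of the short-circuiting case: findFirst() stops pulling elements as soon as the filter matches, so peek() only sees the elements examined up to that point.

     Stream.of("one", "two", "three", "four")
         .peek(e -> System.out.println("Examined: " + e))
         .filter(e -> e.length() > 3)
         .findFirst();
     // Prints "Examined: one", "Examined: two", "Examined: three"; "four" is never examined.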

Don't use the API in an unintended way, even if it accomplishes your immediate goal. That approach may break in the future, and it is also unclear to future maintainers.
Further, while streams guarantee maintaining the encounter order for certain combinations of operations even for parallel streams, these guarantees do not apply to peek. When collecting into a list, the resulting list will have the right order for ordered parallel streams, but the peek action may get invoked in an arbitrary order and concurrently.
So the most useful thing you can do with peek is to find out whether a stream element has been processed which is exactly what the API documentation says:
This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline

1. Performance – you will lose on it
2. Readability – for most people, at least
3. Maintainability



The streams framework does not (and cannot) enforce any of these. If the computation is not independent, then running it in parallel will not make any sense and might even be harmfully wrong. The other criteria stem from three engineering issues and tradeoffs:
Splittability
The most efficiently splittable collections include ArrayLists and {Concurrent}HashMaps, as well as plain arrays (i.e., those of form T[], split using static java.util.Arrays methods). The least efficient are LinkedLists, BlockingQueues, and most IO-based sources. Others are somewhere in the middle. (Data structures tend to be efficiently splittable if they internally support random access, efficient search, or both.) If it takes longer to partition data than to process it, the effort is wasted. So, if the Q factor of computations is high enough, you may get a parallel speedup even for a LinkedList, but this is not very common. Additionally, some sources cannot be split completely down to single elements, so there may be limits in how finely tasks are partitioned.
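A rough way to see the splittability point in code (a sketch; actual speedups depend on the workload, the Q factor, and hardware): the same parallel pipeline runs over an array-backed list and a linked list, and only the former can be partitioned cheaply.

     List<Integer> arrayBacked = IntStream.range(0, 1_000_000).boxed()
         .collect(Collectors.toCollection(ArrayList::new));
     List<Integer> linked = new LinkedList<>(arrayBacked);

     // Same result either way; the ArrayList spliterator splits by index in O(1),
     // while the LinkedList spliterator has to walk nodes to find split points.
     long fromArray  = arrayBacked.parallelStream().mapToLong(Integer::longValue).sum();
     long fromLinked = linked.parallelStream().mapToLong(Integer::longValue).sum();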

https://www.baeldung.com/guava-21-new
https://guava.dev/releases/23.0/api/docs/com/google/common/collect/Streams.html


https://docs.oracle.com/javase/9/docs/api/java/util/stream/Stream.html#count--
     List<String> asList = stringStream.collect(ArrayList::new, ArrayList::add,
                                                ArrayList::addAll);
 
The following takes a stream of strings and concatenates them into a single string:

     String concat = stringStream.collect(StringBuilder::new, StringBuilder::append,
                                          StringBuilder::append)
                                 .toString();
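The prebuilt collector does the same thing and is usually the more idiomatic choice:

     String concat = stringStream.collect(Collectors.joining());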
 

Monday, February 3, 2020

Salesforce Tips




A selective query has at least one query filter that is on an indexed field and reduces the number of rows returned below the system threshold. When a field is indexed, its values are stored in a more efficient data structure. This takes up more space but improves performance when at least two filters with indexed fields are used in a query.

Fields that are indexed by default include:

  • Primary keys: Id, Name, Owner, Email (contacts, leads)
  • Foreign keys: lookup or master-detail relationships
  • Audit dates: SystemModStamp, CreatedDate
  • Custom fields: External ID (Auto Number, Email, Number, Text), Unique
  • LastModifiedDate is automatically updated whenever a user creates or updates the record. LastModifiedDate can be updated to any back-dated value if your business requires preserving original timestamps when migrating data into Salesforce.
  • SystemModStamp is strictly read-only. Not only is it updated when a user updates the record, but also when automated system processes (such as triggers and workflow actions) update the record. Because of this behavior, it creates a difference in stored value where ‘LastModifiedDate <= SystemModStamp’ but never ‘LastModifiedDate > SystemModStamp’.
 

How can LastModifiedDate filters affect SOQL performance?

So, how does this affect performance of a SOQL query? Under the hood, the SystemModStamp is indexed, but LastModifiedDate is not. The Salesforce query optimizer will intelligently attempt to use the index on SystemModStamp even when the SOQL query filters on LastModifiedDate. However, the query optimizer cannot use the index if the SOQL query filter uses LastModifiedDate to determine the upper boundary of a date range because SystemModStamp can be greater (i.e., a later date) than LastModifiedDate. This is to avoid missing records that fall in between the two timestamps.
Let’s work through an example to make this clear.
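A sketch of the kind of filters involved (not the article's worked example; Apex with inline SOQL, and the dates are illustrative):

     // Can use the SystemModStamp index: LastModifiedDate only sets a lower bound.
     List<Account> recent = [SELECT Id FROM Account
                             WHERE LastModifiedDate >= 2020-01-01T00:00:00Z];

     // Cannot use it: an upper bound on LastModifiedDate could miss records whose
     // SystemModStamp is later, so the optimizer cannot rely on that index.
     List<Account> older  = [SELECT Id FROM Account
                             WHERE LastModifiedDate <= 2020-01-31T00:00:00Z];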



REST API
/vXX.X/limits/recordCount?sObjects=Object List
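For example (the API version and object list are illustrative):

     GET /services/data/v48.0/limits/recordCount?sObjects=Account,Contact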

Please note that the length and decimal places are only enforced when editing data via the standard web UI. (i.e., Custom object | New field | Data type: Number | Check the fields - length and decimal places)

Apex and API methods can actually save records with more decimal places than the field definition specifies. This is true for standard and custom fields. Salesforce changes the display to match the definition, but the values are stored in the database as inserted.

When the user sets the precision in custom fields in the Salesforce application, it displays the precision set by the user, even if the user enters a more precise value than defined for those fields. However, when you set the precision in custom fields using the API, no rounding occurs when the user retrieves the number field.
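A quick Apex sketch of that behavior (Amount__c is a hypothetical custom field defined as Number(16, 2)):

     // The value is stored as entered via Apex/API; the standard UI only rounds what it displays.
     Opportunity o = [SELECT Id FROM Opportunity LIMIT 1];
     o.Amount__c = 12.3456;       // more decimal places than the field definition
     update o;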

https://justinyue.wordpress.com/2015/09/12/salesforce-data-modelling-tip-create-composite-key-for-your-custom-object/
In Salesforce, the Id field is the primary key for any SObject. Users can also create a custom text field and make it unique, but users cannot create a composite key for a SObject.
Suppose you create two Lookup fields on the Registration__c object, one for Student__c and one for Course__c, and you need to enforce the business rule that a student can register for a given course only once.
A composite key on the Registration__c object built from the Student__c and Course__c Ids would achieve this, but since there is no out-of-the-box composite key feature, you need to be creative and find an alternative. Here is the solution:
  1. Create a Text field called “Key__c” on Registration__c and make it Unique.
  2. Create a trigger on Registration__c SObject and listen on Before Insert and Before Update events.
  3. In the trigger, assign the Key__c field the concatenated value of the Student__c and Course__c Id fields (a minimal sketch follows below).
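A minimal sketch of such a trigger (object and field names follow the post; production code would also handle nulls, recursion, and so on):

     trigger RegistrationKeyTrigger on Registration__c (before insert, before update) {
         for (Registration__c reg : Trigger.new) {
             // The unique Key__c field then rejects any duplicate Student/Course pair.
             reg.Key__c = String.valueOf(reg.Student__c) + String.valueOf(reg.Course__c);
         }
     }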
https://developer.salesforce.com/docs/atlas.en-us.api.meta/api/sforce_api_calls_describesobjects_describesobjectresult.htm
https://developer.salesforce.com/docs/atlas.en-us.api.meta/api/field_types.htm#i1435616
With rare exceptions, all objects in the API have a field of type ID. The field is named Id and contains a unique identifier for each record in the object. It is analogous to a primary key in relational databases. When you create() a new record, the Web service generates an ID value for the record, ensuring that it is unique within your organization’s data. You cannot use the update() call on ID fields. Because the ID value stays constant over the lifetime of the record, you can refer to the record by its ID value in subsequent API calls. Also, the ID value contains a three-character code that identifies the object type, which client applications can retrieve via the describeSObjects() call.
In addition, certain objects, including custom objects, have one or more fields of type reference that contain the ID value for a related record. These fields have names that end in the suffix “Id”, for example, OwnerId in the account object. OwnerId contains the ID of the user who owns that object. Unlike the field named Id, reference fields are analogous to foreign keys and can be changed via the update() call. For more information, see Reference Field Type.
Some API calls, such as retrieve() and delete(), accept an array of IDs as parameters—each array element uniquely identifies the row to retrieve or delete. Similarly, the update() call accepts an array of sObject records—each sObject contains an Id field that uniquely identifies the sObject.
https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/asynch_api_intro.htm
https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/asynch_api_using_bulk_query.htm

https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/async_api_headers_enable_pk_chunking.htm
Use the PK Chunking request header to enable automatic primary key (PK) chunking for a bulk query job. PK chunking splits bulk queries on very large tables into chunks based on the record IDs, or primary keys, of the queried records.
Each chunk is processed as a separate batch that counts toward your daily batch limit, and you must download each batch’s results separately. PK chunking works only with queries that don’t include SELECT clauses or conditions other than WHERE.
PK chunking is supported for the following objects: Account, Asset, Campaign, CampaignMember, Case, CaseArticle, CaseHistory, Contact, Event, EventRelation, Lead, LoginHistory, Opportunity, Task, User, WorkOrder, WorkOrderLineItem, and custom objects.
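For reference, enabling it is just a request header on the bulk query job; the chunk size shown here is illustrative (the default is 100,000 records per chunk):

     Sforce-Enable-PKChunking: chunkSize=50000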

SOQL Count() query fails with OPERATION_TOO_LARGE. Why?

A Salesforce engineer was kind enough to reply, so I thought I would post the answer here for everyone to benefit.

I will summarize what confused me about this problem.  Since it's just a Count() query, I expected salesforce to be able to handle an unlimited size in O(1) time.  After all, it just needs to return the last row number.  But depending on settings, salesforce may need to do a security calculation for each row, so internally it actually has to visit each row in case some of them are culled from my view.

From SFDC engineering:

OPERATION_TOO_LARGE
The query has returned too many results. Some queries, for example those on objects that use a polymorphic foreign key like Task (or Note in your case), if run by a user without the "View All Data" permission, would require sharing rule checking if many records were returned. Such queries return this exception because the operation requires too many resources. To correct, add filters to the query to narrow the scope, or use filters such as date ranges to break the query up into a series of smaller queries.

In your case a count() query is the same as returning every record at the DB level so if your count returns > 20K records then it is really the same as returning all that data from the DB perspective.  After all, the access grants still have to be calculated to return an accurate count.
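A hedged sketch of the suggested workaround, summing several narrower count() queries instead of one large one (object and date ranges are illustrative):

     Integer total = 0;
     total += [SELECT count() FROM Task WHERE CreatedDate >= 2020-01-01T00:00:00Z
                                          AND CreatedDate <  2020-02-01T00:00:00Z];
     total += [SELECT count() FROM Task WHERE CreatedDate >= 2020-02-01T00:00:00Z
                                          AND CreatedDate <  2020-03-01T00:00:00Z];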


https://success.salesforce.com/ideaView?id=08730000000LhBNAA0


Currently datetime fields don't support millisecond precision. Even if you work with Datetime objects (which do support millisecond precision), when you store them in database the milliseconds are lost.

This is a problem if you need to work with high time precision. It can be worked around in some ways, for example by storing the Unix time in a number field, but it would be much more natural for a Datetime field to be able to store such precision.
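A sketch of that number-field workaround (Reading__c and Epoch_Millis__c are hypothetical custom object and field names; the field would be a Number(18, 0)):

     // Datetime.getTime() returns milliseconds since the Unix epoch as a Long.
     Reading__c r = new Reading__c();
     r.Epoch_Millis__c = Datetime.now().getTime();
     insert r;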
