local/global variables and temp tables ] mssql

In MSSQL,

Users can define local and global temp tables;
Users can define local variables, but not global variables (the @@ globals are maintained by the server).

Global temp tables, i.e. double-hash tables, can be used outside your connection/session. So can global variables.

Local temp tables, i.e. single-hash tables, are inaccessible outside your connection/session. Local variable scope is even smaller.

local temp tables look like #table
global temp tables look like ##table
global vars look like @@IDENTITY
local vars look like @var
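
A minimal T-SQL sketch of the four scopes (table and column names are invented for illustration):

```sql
-- local temp table: visible only to this connection/session
create table #orders_local (id int);

-- global temp table: visible to other connections while it exists
create table ##orders_global (id int);

-- local variable: visible only within this batch / stored proc
declare @cnt int;
select @cnt = count(*) from #orders_local;

-- global "variable": server-maintained and read-only, e.g. @@IDENTITY
select @@IDENTITY;
```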

String.java method names to memorize completely

I was familiar with hundreds of common perl/unix how-to's. I need the same for Java String and JDBC. Here are the top unfamiliar String
how-to's. Exact method names to be committed to memory:

boolean endsWith(String suffix)

boolean startsWith(String prefix, int toffset)

boolean equalsIgnoreCase(String anotherString)

int indexOf(String str, int fromIndex)

String replace(char oldChar, char newChar)

String trim()
Returns a copy of the string, with leading and trailing whitespace omitted.
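
A throwaway self-test exercising all six signatures (the filename string is made up):

```java
public class StringDrill {
    public static void main(String[] args) {
        String t = "  Report_2008.pdf  ".trim();      // leading/trailing whitespace removed
        System.out.println(t.endsWith(".pdf"));        // true
        System.out.println(t.startsWith("port", 2));   // true -- matches starting at offset 2
        System.out.println(t.equalsIgnoreCase("REPORT_2008.PDF")); // true
        System.out.println(t.indexOf("2008", 3));      // 7 -- first hit at or after index 3
        System.out.println(t.replace('_', '-'));       // Report-2008.pdf
    }
}
```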

topics in stored procedure programming

Xia Rong,

I think some investment bank departments have as much business logic in stored procs as in Java. Here are some non-trivial topics that take years and many projects to learn:

* performance tuning, since stored procs often become a performance bottleneck.
* nested stored proc
* temp table perf tuning in a stored proc
* temp table in nested stored proc
* batching large transactions
* transaction log getting full
* massive multi-threaded update/insert/delete
* cursors, which are less used in Sybase, but I think my team will use cursors more in IBM DB2 next year.
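
On the "batching large transactions" point, a common sketch (table and column names invented) is to purge in small chunks so the transaction log never fills up:

```sql
-- delete in batches of 1000 rows so each transaction stays small
declare @rows int
select @rows = 1000
while @rows > 0
begin
    begin transaction
    set rowcount 1000   -- Sybase / older MSSQL; newer MSSQL uses DELETE TOP (1000)
    delete from big_audit_table where trade_date < '2007-01-01'
    select @rows = @@rowcount
    commit transaction
end
set rowcount 0          -- restore the default: no row limit
```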

Topics below are important to SQL programming, and consequently important to stored procedure too.
* analyzing query plan
* optimizer override (for example with temp tables in a stored proc)
* Sybase's special features in the UPDATE statement
* dynamic sql. Microsoft and Sybase support it. Not sure about others.
* nulls in Transact-SQL
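
A tiny dynamic-sql illustration, i.e. executing a string built at runtime (the table name is made up):

```sql
declare @tbl varchar(30), @sql varchar(200)
select @tbl = 'big_audit_table'
select @sql = 'select count(*) from ' + @tbl
exec (@sql)   -- works in Sybase ASE and MSSQL; MSSQL also offers sp_executesql
```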

Let me know if you need some books on these topics.

informatica high-level jargons: long learning curve, low mktVal

Now I think this dnlg is low-value, non-portable. A lot of the training material is written for developer mind-share.

(This type of dnlg is type 1, i.e. jargon.) When you enter a new domain or pick up a new technology, you have to go through this learning curve!

— The learning notes I took in 200807:

a workflow consists of tasks.

a mapping includes sources, targets and … transformations!

many transformations – 1 mapping – 1 workflow – many tasks

A major topic is the different TYPES of transformations
– filter
– lookup
– aggregator
– sorter

a Repository is a database holding the metadata

code polishing

The ironic thing about the tens of hours each month i put into improving my code is that my boss and colleagues probably won’t say my code is quality code.

My boss is likely to notice design “problems”. Any design is the product of trade-off, sacrificing some benefits for other benefits. I think my boss can be pretty fast to notice the sacrifices.

Look at it this way — If u don’t like a design, u won’t notice code quality, will you? Actually this question depends on what you consider “quality”.
 
I have such a long-standing habit of self-review, which is now a liability, not an asset, because it takes hours but doesn’t earn me much
– decide whether to use private or protected when both can work
– shorten method arguments
– reduce dependency between objects, when most people won’t notice the improvement
– improve output messages, to help support job (like your job now)
– reduce repeated output messages
– make log messages more consistent among different modules, so the support guy has fewer surprises and confusions. For example, “duplicate”, “non-unique” and “already exists” mean the same thing, so i use only one of them.
– give more meaningful/consistent variable names,
– combine and merge variables — A lazy/busy programmer can make 50 variables (instead of 20) to hold 50 things. Except for the author, a normal brain can’t keep track of so many unstructured variables.
– choose appropriate error-handling, such as email, log, crash, throw/re-throw, “continue”, “break”, or returning an error code, when each option can work
– wipe out left-over code that is no longer in use
– remove code duplication — Don’t Repeat Yourself
– clear memory to help garbage collector reclaim memory
– check for nulls
 
 
… Most of this extra “polishing” is perhaps unproductive in the eyes of my team members. I think this is another reason i feel the workload is 2-3 times higher.
 
Outside fast-paced wall street, perhaps this “polishing” is more appreciated.

eg: select-into, temp table

create table #pivot(
i int,
primary key(i)
)
create table #foo(
i char(1)
)
insert into #foo values ('0');
insert into #foo values ('1');
insert into #foo values ('2');
insert into #foo values ('3');
insert into #foo values ('4');
insert into #foo values ('5');
insert into #foo values ('6');
insert into #foo values ('7');
insert into #foo values ('8');
insert into #foo values ('9');
insert into #pivot select convert(int,f1.i+f2.i) from #foo f1, #foo f2;
select dateadd(dd,i,'2008-03-01') as date into #dates from #pivot where i<=30;