Wednesday, May 22, 2013

UNIX Command: AWK

Why AWK ?

  • The name AWK comes from the initials of its designers: Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan. The original version of AWK was written in 1977.
  • UNIX has many utilities and AWK is one of them.
  • AWK is an excellent tool for processing rows and columns of data, and for such tasks it is easier to use than most conventional programming languages.
  • AWK can be considered to be a pseudo-C interpreter, as it understands the same arithmetic operators as C.
  • AWK also has string manipulation functions, so it can search for particular strings and modify the output.

Syntax of AWK:

                   awk '/search pattern1/ { Actions }
                        /search pattern2/ { Actions }' file

In the above syntax, the search pattern is a regular expression.
Actions - Statements to be performed. Several pattern-action pairs are possible in one AWK program.
File - Input file.

Example 1:  awk  '{print;}' employee.txt

No pattern is given here, so the action applies to every line.

Working Methodology of AWK :

  • AWK reads the input files one line at a time.
  • Each line is tested against the patterns in the order given; when a pattern matches, the corresponding action is performed.
  • If no pattern matches, no action is performed.
  • In the above syntax, either the search pattern or the action may be omitted, but not both.
  • If the search pattern is not given, AWK performs the given actions for each line of the input.
  • If the action is not given, AWK prints every line that matches the given pattern, which is the default action.
  • Statements within an action are separated by semicolons (or newlines).
  • Each whitespace-separated word in a line is a field: $1, $2, ..., $NF, where NF is the number of fields in the line.
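
The rules above can be tried out with a minimal sketch like the following. The sample records are hypothetical and not taken from employee.txt; they only illustrate the pattern-only, action-only, and pattern-plus-action cases.

```shell
# Hypothetical sample data (name, title, department, salary).
sample='Thomas Manager Sales 5000
Nisha Analyst HR 4000
Jason Developer Tech 5500'

# Pattern only: the default action prints every matching line.
printf '%s\n' "$sample" | awk '/Nisha/'

# Action only: runs for every line; NF is the field count, $NF the last field.
printf '%s\n' "$sample" | awk '{ print $1, NF, $NF }'

# Pattern and action together, with two statements separated by a semicolon.
printf '%s\n' "$sample" | awk '/Manager/ { n = NF; print $1 " has " n " fields" }'
```

The first command prints only the Nisha record, the second prints one line per record, and the third prints a line only for the record containing "Manager".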

Example 2: Print the lines that match a pattern

awk '/Thomas|Nisha/' employee.txt

It prints the lines that contain either Thomas or Nisha.

Example 3: Print only selective fields.

AWK treats each line as a record and each column as a field.

awk '{print $1,$5;}' employee.txt

It prints only the 1st and 5th fields of each line of employee.txt.

Example 4: Initialization and final action

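Syntax: BEGIN { Actions }
        { Actions }
        END { Actions }

The BEGIN actions run once before any input is read, the middle actions run for every input line, and the END actions run once after the last line.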
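
A minimal sketch of the three blocks working together, assuming a hypothetical two-column input (name and salary) that is not part of the original post:

```shell
# Hypothetical sample data: name and salary.
sample='Thomas 5000
Nisha 4000
Jason 5500'

report=$(printf '%s\n' "$sample" | awk '
BEGIN { print "Name Salary" }        # runs once, before any input is read
      { print $1, $2; total += $2 }  # runs for every input line
END   { print "Total", total }       # runs once, after the last line
')
printf '%s\n' "$report"
```

The header line comes from BEGIN, the three data lines from the middle block, and the final "Total" line from END.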


Built-in Variables of AWK:

1. FS (Input Field Separator): By default, AWK assumes that fields are separated by whitespace (spaces or tabs). If the fields are separated by any other character, we can set the FS variable to specify the delimiter.

$ awk 'BEGIN {FS="|"} {print $2}' input_file  

It prints the second field of each line of input_file, given that the field separator is "|".
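
As a small sketch, the same separator can be set either in a BEGIN block or with awk's -F command-line option. The pipe-delimited records below are hypothetical sample data.

```shell
# Hypothetical pipe-delimited records (id|name|department).
sample='101|Thomas|Sales
102|Nisha|HR'

# Setting FS in a BEGIN block, as in the command above.
printf '%s\n' "$sample" | awk 'BEGIN { FS = "|" } { print $2 }'

# The -F option sets the same separator from the command line.
printf '%s\n' "$sample" | awk -F'|' '{ print $2 }'
```

Both commands print the name column, one name per line.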

Tuesday, May 21, 2013

UNIX Command: tr [translate]

Syntax:

The syntax of tr command is:

$ tr [OPTION]... SET1 [SET2]
 

Examples:

1. Convert lower case to upper case:

$ tr '[:lower:]' '[:upper:]'
 

2. Translate braces into parentheses

$ tr '{}' '()' < inputfile > outputfile
 

3. Translate white-space to tabs

$ tr '[:space:]' '\t'
 

4. Squeeze repetition of characters using -s

$ tr -s '[:space:]' '\t'
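
A short sketch of the difference that -s makes, using a hypothetical input string. Note that the trailing newline also belongs to [:space:], so it is translated to a tab as well.

```shell
# Hypothetical input with runs of spaces.
input='a   b  c'

# Without -s: every whitespace character becomes its own tab.
printf '%s\n' "$input" | tr '[:space:]' '\t'

# With -s: each run of translated characters is squeezed into a single tab.
printf '%s\n' "$input" | tr -s '[:space:]' '\t'
```

The character classes are quoted so that the shell does not try to expand them as filename patterns.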
 

5. Delete specified characters (here, digits) using the -d option

$ tr -d '[:digit:]'
 

6. Complement the set using the -c option (here, delete everything except digits)

$ tr -cd '[:digit:]'
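
To make the effect of -c concrete, here is a small sketch that runs -d and -cd on the same hypothetical input string:

```shell
# Hypothetical input string.
input='Order 66, shelf 7b'

# -d alone deletes the digits.
printf '%s\n' "$input" | tr -d '[:digit:]'

# -c complements SET1, so -cd deletes everything that is NOT a digit,
# including the trailing newline.
printf '%s\n' "$input" | tr -cd '[:digit:]'
```

The first command leaves the text without its digits; the second leaves only the digits, with no newline at the end of the output.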
 

7. Remove all non-printable characters from a file (note that this also removes newlines)

$ tr -cd '[:print:]' < file.txt
 

8. Join all the lines in a file into a single line

$ tr -s '\n' ' ' < file.txt
 

9. To replace every sequence of one or more newlines with a single newline

$ tr -s '\n' < textfile > newfile


10. To delete all NUL characters from a file
 
$ tr -d '\0' < textfile > newfile
 
-------------------------------------------------------------------------------

Informatica Basic Interview Questions!!!

1. What is the difference between Informatica 8x and 9x?
2. How many types of fact and dimension tables are available?
3. What is a factless fact table?
4. How is a Lookup active in Informatica 9.1, and what additional option was added to the Lookup?
5. How to update a table without an Update Strategy, or a table which does not have a primary key?
6. What is the difference between a Lookup and a Joiner?
7. What happens if we select all the ports of an Aggregator, or if we don't select any port, for group by?
8. What will be the output of a Router group where no condition is given?
9. What are the limitations of a Joiner?
10. Is the Union transformation in Informatica the same as UNION in Oracle? How to do the same as Oracle UNION in Informatica?
11. What are GK_ID and GCID in a Normalizer?
12. Explain the use of the NewLookupRow port in a dynamic lookup.
13. What is the use of the MD5 function in Informatica?
14. Explain incremental aggregation with a scenario.
15. How to declare a mapplet variable in a parameter file?
16. How to pass a mapping variable from one mapping to another with the help of a workflow variable?
17. What is the command we need to use in the parameter file to use both the session parameter file and the workflow parameter file simultaneously?  [$PMMergeSessParamFile=TRUE]
18. What is the custom property we need to set in order to ignore newline characters within data enclosed in quotes?  [MatchQuotesPastEndOfLine=Yes]
19. Give an example of an Informatica user-defined function.
20. How to handle an Event Wait task for a dynamically changing file name?
21. How to include customized header and footer text for a flat file target?
22. What is session partitioning, and how can we load 10 flat files of the same structure in parallel in a single pipeline?
23. What is the use of pushdown optimization, and what are the limitations or preconditions for PDO?
24. Explain the different types of SCD implementations.
25. What is the SCD type 6 approach?
26. What are star schema and snowflake schema?
27. What is the use and syntax of the Mass Update, PMCMD and PMREP commands?
28. Explain the Transaction Control transformation with an example.
29. What are the methods of code migration from one environment to another?
30. Explain the complete process of a workflow run and the tasks performed by the Integration Service, DTM buffer and all other process threads.
31. Suppose I have a workflow with two sessions linked in series; what needs to be done to run the second session only every seventh time (once weekly)?
32. How to send a mail when the session starts loading to the target?
33. How to handle multiple delimiters in a single file?
34. How will you recover an object which was accidentally deleted?
35. What is the process for running multiple instances of a single workflow?



Please mail jalal.jc@gmail.com for the solution to any of the above questions, or post your comment/query here. :)

Monday, May 20, 2013

Working With COBOL Copybook in Informatica


  • The data structure from the mainframe comes in COBOL copybook format (.cpy). Sample copybook content is shown below.


01 SALES-RECORD.                                                                                                 
                03  HDR-DATA.                                                                          
                   05  HDR-REC-TYPE                             PIC X.
                   05  HDR-STORE                           PIC X(02).
                03  STORE-DATA.
                   05  STORE-NAME                        PIC X(30).
                   05  STORE-ADDR1                       PIC X(30).
                   05  STORE-CITY                            PIC X(30).
                03  DETAIL-DATA REDEFINES STORE-DATA.
                   05  DETAIL-ITEM                            PIC 9(9).
                   05  DETAIL-DESC                         PIC X(30).
                   05  DETAIL-PRICE                   PIC 9(4)V99.
                   05  DETAIL-QTY                             PIC 9(5).
                   05  SUPPLIER-INFO OCCURS 4 TIMES.
                                   10  SUPPLIER-CODE          PIC XX.
                                   10  SUPPLIER-NAME      PIC X(8). 


  • We need to add the standard header and footer to the copybook and save it as a .cbl file, which can then be imported into Informatica as a source definition.





Header: 

IDENTIFICATION DIVISION.
                 PROGRAM-ID.   COPYBOOK.
ENVIRONMENT DIVISION.
                SELECT SALES ASSIGN TO F1.
DATA DIVISION.
FILE SECTION.
FD SALES.

Here 'SALES' will be the source name.

Footer:

WORKING-STORAGE SECTION.
PROCEDURE DIVISION.
STOP RUN.

  • Save the file as Sales.cbl and import it into Informatica.
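
The header and footer wrapping described above could be scripted roughly as follows. This is only a sketch: the file name sales.cpy is a hypothetical choice, the copybook is shortened so the example is self-contained, and any column-position requirements of the COBOL importer are not addressed here.

```shell
# Assumption: the copybook is saved as sales.cpy (hypothetical file name).
# A shortened copybook is written here so the sketch runs on its own.
cat > sales.cpy <<'EOF'
01 SALES-RECORD.
   03 HDR-DATA.
      05 HDR-REC-TYPE PIC X.
      05 HDR-STORE    PIC X(02).
EOF

{
  # Header: everything up to and including the FD entry.
  cat <<'EOF'
IDENTIFICATION DIVISION.
PROGRAM-ID. COPYBOOK.
ENVIRONMENT DIVISION.
SELECT SALES ASSIGN TO F1.
DATA DIVISION.
FILE SECTION.
FD SALES.
EOF
  # Body: the copybook itself, unchanged.
  cat sales.cpy
  # Footer: closes the program.
  cat <<'EOF'
WORKING-STORAGE SECTION.
PROCEDURE DIVISION.
STOP RUN.
EOF
} > Sales.cbl
```

The resulting Sales.cbl contains the header, the copybook, and the footer in that order, which is the file that gets imported as the source definition.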

  

  • The source is imported as a COBOL source definition.

  • In the mapping, the COBOL source gets a Normalizer by default in place of a Source Qualifier.
 
   Important things to remember: A COBOL copybook can contain REDEFINES and OCCURS clauses.
   An OCCURS clause can be removed by repeating the same column that many times.