
I often use the less pager to view logfiles. Usually I use less -F to follow the progress of the log à la tail.

However, some logfiles use national characters in a non-standard encoding (Latin-1, while the system uses UTF-8). Obviously, these will not be displayed correctly.

How can I view such files with less?

The only solutions I found:

  • Correct the encoding of the file (recode or iconv). This does not work while the file is still being written, so it does not let me use less -F. It also destroys the logfile's original timestamp, which is bad from an auditing perspective.
  • Use a pipe (recode latin1... |less). Works for files in progress, but unfortunately then less -F does not appear to work (it just does not update; I believe the recode process exits once it's done).

Any solution that lets me "tail" a logfile and still shows national characters correctly?

sleske
  • It looks from `man less` like there is a preprocessor which you could possibly set to fix your encoding. – isomorphismes May 12 '18 at 10:11
  • @isomorphismes: Yes, `less` does support calling a preprocessor. However, as far as I can tell, the preprocessor reads the input file and creates a new file for `less`, so this would not work for `less -F`. – sleske May 12 '18 at 20:12

3 Answers

3

Hm, apparently less cannot do this. The part of less's source code that implements "following" seems to be:

    A_F_FOREVER:
        /*
         * Forward forever, ignoring EOF.
         */
        if (ch_getflags() & CH_HELPFILE)
            break;
        cmd_exec();
        jump_forw();
        ignore_eoi = 1;
        while (!sigs)
        {
            make_display();
            forward(1, 0, 0);
        }
        ignore_eoi = 0;

As far as my (limited) knowledge of C goes, this means that if "follow" is activated, less will:

  1. seek to the end of input
  2. read and update the display in a loop, until Ctrl-C is pressed

If the input is a pipe, step 1 will not return until the pipe signals EOF. If I use tail -f xx | less, the pipe will never signal EOF, so less hangs :-(.

I did however find a way to get what I want:

 tail -f inputfile | recode latin1.. > /tmp/tmpfile

then

less +F /tmp/tmpfile

This will work, because it lets less +F work on a real file. It's still somewhat awkward, because recode apparently only processes data in blocks of 4096 bytes, but it works...
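If the block buffering in recode is a nuisance, one possible variation is to convert line by line instead. The following is only a sketch, not something tested against real logfiles: the function name latin1_to_utf8 is made up for illustration, it assumes iconv is installed, and it assumes the log is line-oriented. It starts one iconv process per line, which is slow for busy logs but means each line is written out as soon as it arrives.

```shell
#!/bin/sh
# latin1_to_utf8 (hypothetical helper name): read Latin-1 text on stdin
# and write UTF-8 to stdout, one line at a time.
# The body runs in a subshell so the locale change stays local.
latin1_to_utf8() (
    # Treat input as raw bytes so the shell does not trip over
    # byte sequences that are invalid in a UTF-8 locale.
    LC_ALL=C
    export LC_ALL
    while IFS= read -r line; do
        # One short-lived iconv per line: nothing is held back in a buffer.
        printf '%s\n' "$line" | iconv -f ISO-8859-1 -t UTF-8
    done
)

# Intended use, mirroring the two steps above (not run here):
#   tail -f inputfile | latin1_to_utf8 > /tmp/tmpfile &
#   less +F /tmp/tmpfile
```

As with the recode version, less +F is still reading a real file, so following works; the original logfile and its timestamp are left untouched.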

sleske
1

It's possible that recode is buffering output in the pipe so output only comes through when the buffer, probably 4K, is full. You can try using the unbuffer script that comes with expect.

Dennis Williamson
  • No, that is not the problem. The `recode` process simply exits after it detects EOF for the file (after all, it has no way of knowing that the file is still growing); I can confirm this using `ps`. So unbuffer does not help. – sleske Jun 29 '10 at 10:23
  • @sleske: Have you tried `tail -f | recode ... | less -F`? – Dennis Williamson Jun 29 '10 at 13:36
  • @Dennis: Actually yes, I tried it, but it didn't help either. It seems `less -F` just plain does not work on pipes. Even `tail -f myfile | less -F` does not work, though in this case both processes remain alive. – sleske Jun 30 '10 at 07:59
  • Anyway, +1 for good hints. Even if they didn't work, it's good to know that :-). – sleske Jun 30 '10 at 08:00
  • @sleske: By the way, it's `less +F` that follows files like `tail -f` (rather than `less -F`). After some testing, it looks like `recode` is doing some buffering that can't be controlled. This works, but the output is in chunks: `tail -f inputfile | recode ... | less +F` – Dennis Williamson Jul 01 '10 at 08:42
  • @Dennis: Interesting. Your example does not work for me: less just hangs with an empty screen until I press Ctrl-C; then it shows its prompt, but no text. – sleske Jul 01 '10 at 21:07
  • To me it seems rather that `less +F` waits for EOF in its input before even showing a prompt. Since that never comes, it appears to hang. Just `tail -f inputfile | less ` works, but it still hangs once I invoke Shift-F (or Shift-G). So it seems what I want just isn't possible with less... – sleske Jul 01 '10 at 21:12
  • @sleske: Try `less` in that pipeline without any options: `tail -f inputfile | recode ... | less`. Note: if your logfile is not getting much traffic, it could take a while before the buffer is full and anything is output. – Dennis Williamson Jul 01 '10 at 21:40
  • @Dennis: Yes, I tried that, and it does work, but it's not practical. It will show output, and gives me the less prompt once the first screenful of text has been printed, but scrolling to the end of text still makes less hang until enough fresh text has arrived; and Shift-F or Shift-G still hangs less permanently. So it seems less just can't do what I'd like to do... – sleske Jul 04 '10 at 20:31
0

Suggested reading: the section NATIONAL CHARACTER SETS in "Linux / Unix Command: less".

harrymc
  • That, or `env LC_ALL=en_US.LATIN1 less -F file` – akira Jun 28 '10 at 10:03
  • That does not solve my problem. This will cause `less` to accept Latin-1 characters as regular characters (meaning it does not highlight them), but they will still show up incorrectly in a terminal program that expects UTF-8 (as that's the system default). I want to actually *convert* the Latin-1 characters to valid UTF-8, not just have them show up as junk/box characters. – sleske Jun 28 '10 at 10:08
  • @sleske: I don't know of a way to convert and do less at the same time on dynamic files. One can define macros per akira's comment for the several possible encodings that you have. This is assuming that your problem is only the display and not pure conversion. – harrymc Jun 28 '10 at 10:55